Brief Reports 


The Journal of Consulting Psychology will 
accept Brief Reports of research studies in 
clinical psychology for early publication with- 
out expense to the author. The procedure is 
intended to permit the publication of soundly 
designed studies of specialized interest or limited
importance which cannot now be accepted because
of lack of space. Several pages
in each issue will be devoted to Brief Reports, 
published in the order of their receipt with- 
out respect to the dates of receipt of the regu- 
lar articles. Most Brief Reports appear in the 
first or second issue to go to press following 
their final acceptance. 


An author who wishes to submit a Brief 
Report: 
1. Sends the Brief Report, limited to one printed
page and prepared according to the specifications
given below.

2. Also sends to the Editor a full report of the re- 
search study, in sufficient detail to give a clear ac- 
count of its background, procedure, results, and con- 
clusions, which will be filed with the American 
Documentation Institute to insure indefinite avail- 
ability. 


3. Prepares at least 100 mimeographed copies of 
the full report, which the author will send without 
charge to all who request it as long as the supply 
lasts. 


4. Agrees not to submit the full report to another 
journal of general circulation. 


Specifications 

Brief Report. The Brief Report should give 
a clear, condensed summary of the procedure 
of the study and as full an account of the re- 
sults as space permits. 

To insure that the Brief Report will be no 
longer than one printed page, its typescript, 
including all matter except the title and the 
author’s lines, must not exceed 75 lines averaging
42 characters and spaces in length.
Set the typewriter margins for short lines of 
42 characters, which are 3.5 inches long in 
elite typing, and 4.2 inches long in pica. 
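
As a rough illustration of the length rule, the following Python sketch checks a typescript against the quota and reproduces the two line widths quoted above. The function names are hypothetical, and reading "75 lines averaging 42 characters" as a total character budget is an assumption made here, not part of the Journal's specification.

```python
def fits_brief_report(lines, max_lines=75, avg_chars=42):
    """Return True if the typescript body fits the one-page quota.

    `lines` holds the body only (title and author lines excluded).
    Assumption: the rule is read as at most `max_lines` lines whose
    total length does not exceed max_lines * avg_chars characters.
    """
    if len(lines) > max_lines:
        return False
    total_chars = sum(len(line) for line in lines)
    return total_chars <= max_lines * avg_chars


def line_width_inches(chars=42, chars_per_inch=12):
    """Width of a typed line; elite type is 12 characters per inch, pica is 10."""
    return chars / chars_per_inch


print(line_width_inches(42, 12))  # elite
print(line_width_inches(42, 10))  # pica
```

Headings, tables, and references that are essential would simply be included in `lines`, since the notice counts them against the same quota.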

The manuscript of the Brief Report must 
be double spaced throughout. Except for its 
short lines, it follows the standard style (1). 
Headings, tables, and references are avoided 
or, if essential, must be counted in the 75 
lines. Each Brief Report must be accom- 
panied by a footnote in the style below, 
which is typed on a separate sheet and not 
counted in the 75-line quota:¹

1 An extended report of this study may be obtained
without charge from John Doe, 300 Market
St., Prospect 6, Mass. (giving the author’s full name 
and address), or for a fee from the American Docu- 
mentation Institute. Order Document No. ——, re- 
mitting $—— for microfilm or $—— for photo- 
copies. 

Extended report. Because the extended re- 
port is intended for photoduplication, and is 
not copy to be sent to a printer, its style 
should differ in several ways from that of 
other manuscripts: (a) The extended report 
should be typed with single spacing for 
economy in duplication. (b) Tables and figures
should be placed adjacent to the text
which refers to them. A caption should be 
typed below each figure. (c) Footnotes should 
be typed at the bottom of the page on which 
reference is made to them. In other respects, 
the full report is prepared in the style speci- 
fied by the Publication Manual (1). 


Reference 
1. American Psychological Association. Council of 
Editors. Publication manual of the American 
Psychological Association (1957 rev.). Wash- 
ington, D. C.: American Psychological Asso- 
ciation, 1957. 
Relationship Between Certain Personality Variables 
and Continuation in Psychotherapy¹


Earl S. Taulbee 
VA Mental Hygiene Clinic, Omaha, Nebraska 


It is obvious from the wide range of vari- 
ables (pertaining to patient, therapist, tech- 
nique, environment, and the interaction of 
these) which have been found to be associated 
with continuation and improvement in psy- 
chotherapy, that the process and outcome of 
treatment are extremely complex. In general, 
studies reported to date offer support for the 
relationship of such factors as socioeconomic
status, education, intelligence, and
number of interviews to continuation in psychotherapy
and to its outcome. Very little research
has been done on the role of specific
personality variables in the process of treat- 
ment. Because of the limited facilities avail- 
able for treatment, accurate prognosis is a 
very important problem. 

Cartwright (2) studied the outcome of 78 
clients seen in client-centered therapy. He in- 
terpreted the relation found between success 
ratings and length of therapy as suggesting 
that certain individual differences between 
clients give rise to different kinds of thera- 
peutic process. Two kinds of process were 
identified as: “short” (1-12 interviews) and 
“long” (13-77 interviews). Very similar re- 
sults were reported by Taylor (9) using cases 
seen in a psychoanalytically oriented VA men- 
tal hygiene clinic. These two studies offer 
some evidence that continuation and rated 
improvement in therapy are more a function 
of personality differences in clients than the 
type of therapy employed. Gallagher (4) 
found that the more anxious the individual 
the more likely he would be to remain in 
therapy and the more defensive the individual 
the greater the likelihood of his prematurely 


1 From the Veterans Administration Mental Hygiene
Clinic, Omaha, Nebraska.
leaving therapy. Investigating the differences 
between a group of patients rated improved 
and an unimproved group, Rosenberg (6) 
found that the improved showed higher IQ, 
greater productivity, greater emotional depth 
and responsiveness, more sensitivity and tact, 
and higher energy level and drive for achieve- 
ment. The unimproved showed more stereo- 
typy and more preoccupation with physical 
complaints. Mowrer suggested that “. . . de- 
pression, inferiority feeling, anxiety, and other 
painful affects which the neurotic experiences 
are signs of strength within the personality 
. . . an individual with internal disharmony 
who is experiencing these distressing emotions 
is not nearly so badly off as he would be if, 
under the same circumstances, he were bland, 
unconcerned, unmotivated” (5, p. 91). 


Purpose of the Present Study 


It is the author’s opinion that certain 
identifiable personality variables are associ- 
ated with the premature termination of, or 
continuation in, individual psychotherapy. 
Furthermore, regardless of the way in which 
these interact with other important variables 
in the therapeutic relationship, they will be 
reflected in certain psychological tests. The 
following specific hypotheses were tested: 

Hypothesis 1. Certain symptoms and af- 
fects, as revealed by specific MMPI scales 
and Rorschach variables, are positively re- 
lated to continuation in psychotherapy: 

(a) The “continuers” show a greater ele- 
vation of the “symptom” scales of the MMPI 
(Hs, D, Pa, Pt, and Sc) than the “attriters.” 

(b) The continuers give a larger number 
of color (C), shading (Y), vista (V), and 
anatomy (An) responses on the Rorschach 
Table 1 
Comparison of Continuer and Attriter Groups on MMPI Scales 

Attriters
MMPI Scale     Mean      SD
?               8.08     (illegible)
L               4.50      1.99
F               5.13      3.91
K              12.98      5.16
Hs             77.38     15.46
D              70.43     15.47
Hy             70.85     10.21
Pd             60.58     12.50
Mf             51.55     10.23
Pa             53.83     10.39
Pt             65.05     15.73
Sc             60.05     15.13
Ma             56.25      9.41
Es             40.33      7.03

Note. Raw scores given on ?, L, F, K, and Es (Ego-strength); T-scores on all other scales. 
* p less than value indicated with one-tailed test for those scales where the direction was predicted; otherwise two-tailed test. 


(and a greater number of continuers give such 
responses). 

Hypothesis 2. The continuers are less de- 
fensive as reflected by their rejecting fewer 
cards (and fewer cases rejecting cards) and 
having a lower F% and A%. 

Hypothesis 3. The continuers manifest a 
more persistent mental attitude by giving a 
greater number of total responses (R), and 
a greater number of space responses (S) (and 
a greater number of cases give S responses). 

Hypothesis 4. The continuers more closely 
resemble a group of “normal” subjects when 
compared on the basis of R, F%, A%, num- 
ber of cards rejected, and total number of S, 
C, Y, V, and An responses, than do the at- 
triters. 


Subjects and Procedure 


The 85 patients were veterans in a men- 
tal hygiene clinic who had been diagnosed 
psychoneurotic without organic complications 
and who had been administered both the Ror- 
schach and MMPI during the intake pro- 
cedure. The patients were divided into two 
groups to conform to the “short” and “long” 
therapeutic processes suggested by Cart- 
wright: (a) 40 subjects who terminated 
treatment prior to the 13th interview, who 
are referred to as the “attriters” (mean age 
33.45, SD 7.52; mean education 10.30 years, 
SD 2.14); (b) 45 subjects who remained in 
treatment for 13 or more interviews, who are 
referred to as the “continuers” (mean age 
33.60, SD 7.25; mean education 10.93 years, 
SD 2.55). The two groups do not differ sig- 
nificantly on mean age and education. 
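
The text does not state which test was used for this comparison. Assuming an ordinary pooled-variance t test on the reported means and standard deviations, it can be reproduced approximately as follows; the function name is illustrative only.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups, from summary statistics.

    Returns (t, degrees of freedom). Assumes equal population variances.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Continuers (n = 45) versus attriters (n = 40), values taken from the text.
t_age, df = pooled_t(33.60, 7.25, 45, 33.45, 7.52, 40)
t_education, _ = pooled_t(10.93, 2.55, 45, 10.30, 2.14, 40)
print(round(t_age, 2), round(t_education, 2), df)
```

Under this assumption both values of t fall well short of the two-tailed .05 criterion for 83 degrees of freedom (about 1.99), which agrees with the statement that the groups do not differ significantly.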

The 50 normal subjects were residents of 
a small midwestern community² located in 
the same geographical locale from which the 
neurotics came. They were comparable in age 
(mean age 35.62, SD 7.24), and the mean 
education of the neurotics suggests that the 
normals are probably similar in this respect 
as well, based on average education expectancy. 

All tests were individually administered. 
The Rorschach scoring of the original ex- 
aminer was accepted for all of the normal 
cases as the protocols had been scored ac- 
cording to the Beck system (1) and checked 
by a second person. In the cases of the pa- 
tients where the Rorschach protocols had not 
been scored according to Beck or where the 
records had been scored by psychology 
trainees, identification was removed from the 


2 The records were taken from a larger study done 
under the supervision of Marshall R. Jones and sup- 
ported by the University of Nebraska Research 
Council. 
Table 1 (continued)

Continuers
MMPI Scale     Mean      SD        t       p*
?               8.27     10.70      .08    N.S.
L               3.60      2.59     1.79    N.S.
F               6.02      3.16     1.13    N.S.
K              12.38      4.89      .54    N.S.
Hs             80.36     11.57      .98    N.S.
D              78.29     12.40     2.53    .025
Hy             74.91      8.10     1.99    .05
Pd             59.91     11.46      .25    N.S.
Mf             57.96      9.59     2.93    .01
Pa             60.64     10.37     2.98    .005
Pt             73.87     15.32     2.58    .025
Sc             69.58     13.42     3.02    .005
Ma             56.69      9.51      .21    N.S.
Es             38.34      5.74     1.40    N.S.
Table 2 
Comparison of Normal, Continuer, and Attriter Groups on Mean Scores for Certain Rorschach Variables 

                  Continuer (N = 45)        Attriter (N = 40)
Rorschach
variable          Mean        SD            Mean           SD
# R               24.84**     11.69         15.40**         7.72
F%                63.98       16.41         69.50          19.85
A%                46.20**     15.4[?]       (illegible)    17.18
# Rejections        .49*       1.22         (illegible)     1.51
# S                1.51*       1.60         (illegible)     1.24
# C                4.13**      3.01         (illegible)     2.43
# Y                2.53*       2.40         (illegible)     2.22
# V                 .96**      1.15         (illegible)      .40
# An               2.67**      2.71         (illegible)     1.65

* Continuers and attriters significantly different at .05 level (one-tailed tests). 
** Continuers and attriters significantly different at .01 level (one-tailed tests). 


records and they were rescored by the author. 
In order to eliminate some of the unreliabil- 
ity of scoring of the Rorschach variables be- 
ing investigated, no attempt was made in 
stating the hypotheses to differentiate be- 
tween the responses determined primarily and 
secondarily by the form element. There was 
simply a count of the responses in which the 
particular determinant was present. 


Results 


Hypothesis 1. Symptoms and affects. The 
results support the hypothesis that the con- 
tinuers show a greater elevation of the symp- 
tom scales (Hs, D, Pa, Pt, and Sc) of the 
MMPI than the attriters. These differences 
were all significant at the .05 level or less 
except for the Hs scale, which was in the predicted
direction but did not reach significance. The means and standard
deviations of the MMPI scales for the two 
groups are reported in Table 1. 

As predicted, the continuers gave more C, 
Y, V, and An responses, and a larger number 
of cases gave responses falling in these categories.
The mean differences between the two 
groups on the number of responses for each 
variable are all significant at the .05 level or 
less as shown in Table 2. Significantly more 
continuers gave C, Y, V, and An responses 
than did the attriters. The number and percentage
of individuals in each group who 
gave these responses, and the chi-square tests 
of differences are reported in Table 3. 

Hypothesis 2. Defensiveness. The continuers
rejected significantly fewer cards and gave a
significantly lower A%, and significantly fewer
continuers rejected cards. The F% was lower in 
the continuers but not significantly so (see 
Tables 2 and 3). 
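
The frequency comparisons in Table 3 are fourfold chi-square tests. As a check on the arithmetic, the sketch below applies the ordinary Pearson formula to the card-rejection entries (8 of 45 continuers and 22 of 40 attriters rejected at least one card). That no continuity correction was applied is an inference from the agreement with the tabled value, not something the text states.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
        [[a, b],
         [c, d]]
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: continuers, attriters. Columns: rejected a card, rejected none.
chi2 = chi_square_2x2(8, 45 - 8, 22, 40 - 22)
print(round(chi2, 2))  # 12.85, the value given in Table 3
```

The same function applied to the An entries (33 of 45 continuers, 15 of 40 attriters) gives 11.06, again matching the table.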


Table 2 (continued)

                  Normal (N = 50)
Rorschach
variable          Mean       SD
# R               24.52     12.78
F%                62.16     15.08
A%                49.98     14.74
# Rejections        .40      1.00
# S                1.38      1.66
# C                4.50      3.22
# Y                3.04      2.65
# V                380 [decimal point illegible]   1.31
# An               2.20      2.26


Table 3 

Comparison of Number and Percentage of Subjects in Normal, Continuer, and Attriter Groups 
Giving Certain Types of Rorschach Responses 

                                                              Continuer vs. Attriter
Rorschach
variable             Normal               Continuer    Attriter    Chi square    p
Rejecting card(s)     9 (18%)              8 (18%)     22 (55%)      12.85      .001
S                    31 (62%)             30 (67%)     19 (48%)       3.19      .10
C                    (illegible)          41 (91%)     28 (70%)       4.73      .005
Y                    (illegible)          34 (76%)     22 (55%)       3.98      .05
V                    (illegible)          24 (53%)      8 (20%)      10.02      .01
An                   (illegible) (76%)    33 (73%)     15 (38%)      11.06      .001
Table 4 
Frequency and Assigned Weights of Prognostic Signs Occurring in the Continuer and Attriter Groups 


Continuers 
Prognostic 


Attriters 


sign Present Absent 


Present Absent 


24 19 


15 


Hy > Pt 
Hy > Sc 
Pd > Pa 
Pd > Pt 
Pd > Sc 
Ma > Pa 
Ma > Pt 
Ma > Sc 
R ≤ 15 
58 
Rejections ≥ 1 
S% ≤ 2 
C% 6 
33 
V% ≤ 2 
An% = 0 
10 
6 
10 


Hypothesis 3. Persistence. The continuers 
gave significantly more S responses and a 
greater number of total responses (R). The 
difference between the number of continuers 
and attriters giving S responses falls just
short of significance at the .05 level. 

Hypothesis 4. The continuers were more 
like the normal subjects on each of the vari- 
ables predicted than they were like the at- 
triters. As was expected, the two former 
groups tended to differ from each other with 
respect to the importance of the “form” ele- 
ment in the responses (e.g., the normal group 
gave more FC and fewer CF and C responses 
than did the continuers, similarly with the 
other variables). 

The continuers gave a significantly larger 
number of total responses than did the at- 
triters; therefore chi squares were computed 
between the two groups on S, C, Y, V, and 
An responses corrected for R. The method of 
correction was that suggested by Cronbach 
(3, p. 411). Each variable was plotted against 
R for the combined groups, and the median 
was used as the dividing point. Then the 
number of cases in each group falling above 
and below the median were compared by chi 
square. The results were all significant at the 
.05 level or better. In addition, the same 
method of correction was applied to the percentage
score for each variable. The chi 
squares were all significant at less than the 
.001 level except for S% and Y% which did 
not reach the .05 level. 
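
The percentage-score version of this correction can be sketched as follows: each count is expressed as a percentage of R, the combined groups are split at the median, and the resulting fourfold table is tested by chi square. The data are invented for illustration, the function names are hypothetical, and counting cases exactly at the median as "not above" is an assumption, since the text does not say how ties were handled.

```python
from statistics import median


def percent_of_r(count, total_r):
    """Express a determinant count as a percentage of total responses."""
    return 100.0 * count / total_r


def median_split_counts(group_a, group_b):
    """Return ((above, not_above), (above, not_above)) for the two groups,
    using the median of the combined groups as the dividing point."""
    cut = median(group_a + group_b)

    def split(values):
        above = sum(1 for v in values if v > cut)
        return above, len(values) - above

    return split(group_a), split(group_b)


# Invented (count, R) pairs; these are NOT the study's data.
continuers = [(4, 25), (6, 30), (3, 20), (5, 22), (2, 18), (7, 28)]
attriters = [(1, 15), (0, 12), (2, 16), (1, 14), (3, 17), (0, 10)]

cont_pct = [percent_of_r(c, r) for c, r in continuers]
attr_pct = [percent_of_r(c, r) for c, r in attriters]

(a, b), (c, d) = median_split_counts(cont_pct, attr_pct)
n = a + b + c + d
# Pearson chi-square for the fourfold table, without continuity correction.
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
print((a, b), (c, d), round(chi2, 2))
```

With groups this small the chi-square value is only a demonstration of the procedure; the plotted-against-R variant described first in the text would need the scatter diagram itself.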

In the interest of the practical significance 
of the variables for prognostic purposes, a 
cutoff point for each and for the variables 
combined was sought which would afford the 
maximum differentiation effectiveness of the 
measures. Therefore, an objective configural 
analysis, which has been described in detail 
elsewhere (7, 8), was applied to the MMPI 
profiles of the subjects and there were eight 
scale pairs which differentiated the two 
groups at the .10 level or less. Then the best 
cutoff point for separating the continuers and 
attriters on the basis of the percentage score 
for each Rorschach variable was determined. 
The resulting prognostic signs (eight MMPI 
scale pairs and eight Rorschach variables), 
frequency and significance level of occurrence 
of each, and arbitrary weights assigned for 
prognostic purposes are shown in Table 4. 
A combined prognostic score was determined 
for each patient by summing the weights of 
the signs present. These prognostic scores for 
the 85 patients were arranged in a frequency 
distribution and the most effective upper and 
lower cutoff points determined. An interval 
tabulation of the frequency of prognostic 


28 2.80 .10 
33 3.83 .10 
24 30 7.59 .01 
35 19 20 7.39 .01 
36 2216 13.86 .001 
1 28 2.97 .10 
37 14 6 4.07 .05 
38 16 8.42 .01 
36 21219 9.79 .01 
37 14.42 .001 
37 12.85 .001 
1 30 2218 4.04 .05 
40 18 10.48 .01 
18 3.12 .10 
33 7 11.74 .001 
25 15 11.06 .001 
scores occurring in the two groups is shown 
in Table 5. The upper (19-43) and lower 
(0-10) ranges represent the most effective 
cutoff points for differentiating the attriters 
and continuers, respectively. 
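
The scoring rule just described (sum the weights of the signs present, then compare the total with the two cutoff ranges) can be sketched as below. The weights are invented placeholders, not the values assigned in Table 4; only the sign names and the score ranges come from the article.

```python
# Hypothetical weights for illustration only; the article's actual
# weights are those listed in its Table 4.
EXAMPLE_WEIGHTS = {
    "Hy > Pt": 2,
    "Pd > Sc": 4,
    "Ma > Sc": 4,
    "An% = 0": 4,
}


def prognostic_score(signs_present, weights=EXAMPLE_WEIGHTS):
    """Sum the weights of the prognostic signs present in one record."""
    return sum(weights[sign] for sign in signs_present)


def score_range(score):
    """Map a score onto the three ranges tabulated in Table 5."""
    if score >= 19:
        return "upper range (19-43): more typical of attriters"
    if score <= 10:
        return "lower range (0-10): more typical of continuers"
    return "middle range (11-18): indeterminate"


record = ["Pd > Sc", "Ma > Sc", "An% = 0"]
score = prognostic_score(record)
print(score, score_range(score))
```

Because the cutoffs were chosen on the same 85 cases they classify, any such rule would need checking on a new sample before use.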

In previous unpublished research on the 
outcome of psychotherapy, the author ob- 
tained therapists’ ratings of improvement on 
56 of the cases used in the present study (21 
attriters and 35 continuers). These cases were 
divided into two groups, one group contain- 
ing those who showed limited or no improve- 
ment (all the attriters and 20 or 57% of the 
continuers) and the other group comprised of 
those showing moderate or extensive improve- 
ment (15 or 43% of the continuers). Herein- 
after these will be referred to as the “limited 
improvement” and “moderate improvement” 
groups. A comparison of these two groups 
on mean MMPI profiles and Rorschach vari- 
ables mentioned above reveals relationships 
between these scores and improvement similar
to those found between these scores and 
continuation in treatment. The only excep- 
tion was that the limited improvement group 
had a higher Hs score than did the moderate 
improvement group (the Hs was not signifi- 
cantly higher in the continuers than in the 
attriters). Patients in the moderate improve- 
ment group tended to have higher scores on 
D, Mf, Pa, Pt, and Sc, and to have a greater 
number of C, Y, V, and An responses, a 
larger R, and a lower number of rejections 
and A%. Tests of significance were not com- 
puted between the groups because of the small 
number of subjects in the moderately im- 
proved group and the crude rating scale used. 
The writer is well aware of the likelihood of 
contamination in the judgments of the thera- 
pists in that the ratings of improvement may 
have been influenced by the length of time 
the patients remained in treatment. A partial 
check on this was made by comparing the 
limited improvement and the moderate im- 
provement cases of the continuer group only. 
The results reveal a tendency for most of the 
variables to be related in the predicted direc- 
tion to improvement. However, these data 
from the unpublished research should be con- 
sidered only as suggesting a relationship be- 
tween the personality variables studied and 
improvement in psychotherapy. 


Discussion 


The results of this study offer support for 
the hypothesis that there are identifiable per- 
sonality variables which are associated with 
the premature termination of, or continua- 
tion in, individual therapy. Also, there is evi- 
dence to suggest that all or most of these vari- 
ables are related to degree of improvement as 
rated by therapists. Because of the very ho- 
mogeneous nature of the patients used in this 
study, the strength and number of variables 
identified are probably minimal. It is expected 
that with diagnostically more heterogeneous 
groups, such as those used by Cartwright and 
Taylor, these differences would be more pro- 
nounced and other pertinent attributes could 
be identified. 

The author interprets the data to indicate 
that those patients who stay in treatment 
longer, and who are likely to show greater 
improvement, are considerably more responsive
emotionally to their perceived world than 
are those who drop out prematurely. They are 
sensitive to a wider range of affective stimuli, 
including both those of a more pleasurable 
and painful nature. The continuers are more 
anxious, sensitive, dependent, self-doubting, 
and have increased awareness of feelings of 
inadequacy, inferiority, and depression. They 
possess better potential for self-appraisal and
affective reactivity, and have more of a compulsive,
introspective attitude. The greater need 
of the continuers to be accepted and to re- 
ceive affection is probably a motivating fac- 
tor in the increased interaction with others. 
However, these interactions are in many ways 
immature. The continuers’ stronger feminine 
interests, or less adequate identification with 
the cultural norms of masculinity, and greater 


Table 5 
Frequency and Percentage of Prognostic Scores Occurring in the Continuer and Attriter Groups 
and Differences Between Groups 

Prognostic
score          Continuers     Attriters     Chi square     p
19-43           4 (9%)        29 (73%)        33.15       <.001
11-18          16 (36%)       10 (25%)
0-10           25 (56%)        1 (3%)         25.35       <.001

preoccupation with anatomy may be indicative
of more disturbance in the sexual area. 
The attriters’ not scoring significantly lower 
on the Hs scale as they did on Mf and An 
would be consistent with this view. Many 
clinicians consider An responses to be a screen 
for things more directly sexual in nature. 
There is also a greater tendency for the con- 
tinuers to be more moody, to nurse grudges, 
and to be basically more hostile. 

The repressive defense of the continuers is 
failing and, instead of a strengthening of this 
mechanism as occurs in the attriters, they 
are drawing upon other defenses, particularly 
projection, phobic and obsessive ideation, or 
compulsive behavior. The test pattern of the 
continuers is much more like that usually 
considered to be characteristic of patients 
whose primary symptoms are of a conversion 
or somatization nature than is the attriters’. 
Approximately 60% of both the attriters and 
continuers on whom the information was 
available were diagnosed anxiety reactions 
and about 35% of both groups were classi- 
fied as conversion reactions. The continuers 
have an immature attitude toward life, are 
poorly controlled emotionally, have fear and 
guilt related to their sexual functioning, and 
pay a great deal of attention to anatomy. 
Their primary defense of repression, along 
with their tendency to convert emotional con- 
flict into motor or sensory channels, has not 
been effective in alleviating the underlying 
anxiety; therefore, other defenses have been 
resorted to but to a lesser degree. 

In contrast to the continuers, the attriters 
emphasize intellectual control and handle 
situations in an impersonal, matter-of-fact 
way. Repression has generalized to the point 
that they are able to acknowledge or respond 
to a very limited range of emotional stimula- 
tion; consequently, both their social interac- 
tion and thought processes have become very 
stereotyped and barren. The attriters’ relative 
absence of emotional lability and consciously 
felt anxiety, plus a greater withdrawal and a 
more intellectual approach to their problems, 
suggest that their repressive and narcissistic 
defenses and symptoms are working suffi- 
ciently well to enable them to adjust without 
continued treatment, at least temporarily. 

Quantitatively, the continuers and normals 
are very similar on the Rorschach variables 
investigated. However, a qualitative inspec- 
tion of the data reveals that the continuers 
are less well integrated and manifest a more 
immature responsiveness. 

These findings are consistent with the gen- 
erally accepted indicators of better risks for 
therapy and with the results of the studies 
mentioned. There is much to suggest that 
feelings of anxiety, inadequacy, depression, 
dependency, and the potential for self-evalua- 
tion and affective responsiveness are signs of 
strength within the personality. While the 
continuers tend to have more vague somatic 
symptoms, an inspection of the improvement 
ratings reveals that there is less improvement 
in those patients who are extremely preoc- 
cupied with hypochondriacal complaints. The 
exaggerated physical concern would appear 
to be a defense against accepting the psycho- 
genic etiology of their symptoms. 


Summary 


This study investigated the hypothesis that 
there are certain identifiable personality vari- 
ables which are related to therapy prognosis. 
A group of neurotics who remained in treat- 
ment for 13 or more interviews and a group 
who terminated prior to the 13th interview 
were compared on the basis of certain Ror- 
schach scoring categories and MMPI scales. 
Both of these groups were compared to a 
group of normal subjects on the Rorschach 
variables. As predicted, the continuers were 
less defensive, and more persistent, anxious, 
sensitive, and dependent than the attriters. 
They possessed an increased consciousness of 
feelings of inadequacy, inferiority, and de- 
pression and had better potential for self-ap- 
praisal, emotional responsiveness, and more 
of an introspective attitude. There is evidence 
to suggest that these personality variables are 
associated with improvement as well as with 
continuation in therapy. The continuers re- 
sembled the normals more than they did the 
attriters on the Rorschach measures. The re- 
sults support the contention that painful af- 
fects which the neurotic experiences are signs 
of strength within the personality. 

Prognostic scores are suggested for differ- 
entiating the continuers from the attriters. 
However, the scores have not been cross-validated
and should be used very cautiously. 
A Case of Folie a Deux and Projective Techniques 


Edward M. Scott 
Eastern Oregon State Hospital 


Folie a Deux is mentioned only briefly in 
the literature. Psychological testing has been 
reported in but two articles. Another case on 
such an intriguing topic should be of some 
interest. 

Recently committed to the Eastern Oregon 
State Hospital were two 50-year-old, prob- 
ably identical, twin brothers. They had never 
married and had a “spotty” work record. The 
hospital staff diagnosed them as paranoid 
schizophrenics, with folie a deux condition. 
“A,” the dominant twin, became psychotic 
and influenced “B” into the same condition. 
A brief summary of the psychological battery 
is given: 

On the Wechsler-Bellevue Intelligence Scale: 


                        “A”    “B”
Verbal Scale IQ         105    105 
Performance Scale IQ     99     94 
Full Scale IQ           103    100 


On the Rorschach, “A” gave 13 responses, 
with 11 F-responses, and 11 perceptions that 
were sexual in nature. “B” gave 21 responses, 
with 7 F-responses and * that were sexual in 
content. Many of their perceptions appeared 
to be fundamentally of the same psychopatho- 
logical nature. For example, on Card 9, “A” 
said, “Nasty looking. Part of a woman. May- 
be the lower part”; while “B” said, “Oh! Un- 
clean. Just the upper part. Most women are 
in the lower part.” 


Their TAT responses appeared to show 
similar dynamic trends. It is difficult to show 
a typical theme; perhaps Card 14 is signifi- 
cant. “A” responded, “Somebody looking out 
the window. Ooh! Terrible. Outside is free- 
dom. That’s a lot better. Somebody penned 
him up. He hurt the world. Gave it a dis- 
ease.” “B” responded to the same card, “Ooh! 
Must be waiting for somebody, all alone. Re- 
morse, guilty, committed murder or killing. 
Maybe his past life caused him to do that. 
He'll have to pay. Probably go mad.” 

“A” earned a higher score, 92, than “B’s” 
81 on the Bender-Gestalt. 

The projective tests indicated that “A” was 
more deteriorated yet a bit more intellectual. 
Using Grover’s (2) and Gralnick’s (1) division
of folie a deux, this case appears to be 
one of folie imposée, although some might 
hold to folie communiquée. Both patients responded
to electroshock therapy and were discharged
shortly thereafter. 

Psychological testing seems to be of defi- 
nite value in folie a deux. 
The Prognostic Significance of Certain 
Behavioral Variables 


A. Eskey and Ira Friedman 
Cleveland Receiving Hospital and State Institute of Psychiatry 


In a previous study (3) it was reported that 
disoriented mental patients do not improve 
more rapidly than oriented patients. This runs 
contrary to the expectation that severe symp- 
toms reflecting personality disorganization are 
often more refractory than are milder overt 
symptoms. The authors suggest that disori- 
entation may not be “sufficiently representa- 
tive of ‘acute onset . . . confusion and atypi- 
cal symptoms’ (1) to yield consistent results 
in the anticipated direction.” It was suggested 
that other variables be investigated which 
might be considered more reflective of this 
symptom picture. The present study is an ap- 
proach to this problem. Specifically, it repre- 
sents an attempt to investigate the relation- 
ship of verbal activity, motor activity, and 
quality of thinking to length of hospitaliza- 
tion in order to evaluate whether these vari- 
ables are related to rapidity of improvement. 


Method 


Two hundred randomly selected psychotic 
patients, male and female, between the ages 
of 16 and 59 were included in the study. This 
group exhibited a wide range of motor and 
verbal activity as well as varying degrees of 
intactness and disorganization of thought 
processes. The following criteria were used to 
exclude patients from the study: those who 
remained in the hospital for less than two 
weeks, those with any type of organic brain 
damage, alcoholics, mental defectives, chronic 
patients transferred to “custodial” institu- 
tions, those discharged as unimproved, pa- 
tients leaving against medical advice, cases 
where language difficulties made communica- 
tion unreliable, and patients who were under 
heavy sedation. 


Each patient was rated on motor activity, 
verbal activity, and quality of thinking. These 
ratings were based upon information derived 
from mental status examinations obtained on 
each patient at the time of hospital admis- 
sion. Motor and verbal activity were rated on 
a nine-point scale which ranged from under- 
activity to overactivity (two-tailed) while 
thinking was rated on a five-point scale rang- 
ing from intact to disorganized thinking (one- 
tailed). The scales are presented in Table 1. 

Since one of the Es was the sole rater in 
the experiment proper, a reliability study was 
undertaken in order to ascertain whether other 
judges could reliably estimate motor activity, 
verbal activity, and quality of thinking from 
a report of a mental status examination. Ten 
randomly selected patients were rated by five 
judges, including E. The reliability of the average
ratings, computed by Ebel’s (2) application
of the Spearman-Brown formula to the 
formula for the reliability of individual ratings
set forth by Snedecor (5), yields the following
reliability coefficients: motor activity 
r = .98, verbal activity r = .98, and quality 
of thinking r = .96. 
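
The computation named here rests on an analysis of variance of the patient-by-judge rating table. A minimal sketch follows, assuming that differences in level between judges are removed from the error term (the text does not say which of the available options was used) and using invented ratings rather than the study's data. The function name is hypothetical.

```python
def rating_reliability(ratings):
    """Reliability of single and of averaged ratings.

    `ratings` is a list of rows, one row per patient and one column
    per judge. Returns (r_single, r_average).
    """
    n = len(ratings)       # patients
    k = len(ratings[0])    # judges
    grand_mean = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_patients = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_judges = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = ss_total - ss_patients - ss_judges

    ms_patients = ss_patients / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Intraclass reliability of a single judge's ratings.
    r_single = (ms_patients - ms_error) / (ms_patients + (k - 1) * ms_error)
    # Spearman-Brown step-up to the average of k judges.
    r_average = k * r_single / (1 + (k - 1) * r_single)
    return r_single, r_average


# Invented ratings: five patients rated by three judges on a nine-point scale.
toy_ratings = [[1, 2, 1], [3, 3, 4], [5, 5, 6], [7, 8, 8], [9, 9, 9]]
r_single, r_average = rating_reliability(toy_ratings)
print(round(r_single, 3), round(r_average, 3))
```

The stepped-up value is algebraically the same as (mean square for patients minus error mean square) divided by the mean square for patients, which is why averaged ratings are always at least as reliable as single ratings when the single-rating value is positive.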

Since all the patients included in the study 
were discharged as improved or recovered, the 
prognostic significance of the variables was
derived from relating them to length of hospitalization.
From an operational point of 
view then, prognosis refers to the rapidity of 
improvement or recovery rather than to de- 
gree or quality of improvement. Comparisons 
were made between groups showing high, low, 
and normal verbal and motor activity, as well 
as those displaying intact, as opposed to distorted
or disorganized, thought processes in 
terms of length of hospitalization. The t test, 
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Table 1 
Scales for Verbal Activity, Motor Activity, and Quality of Thinking 


Criterion 


Verbal Activity Scale 


Patient is extremely circumstantial; talks almost constantly. Little or no pause in his conversation. Minimal 
(or no) attention paid to examiner's questions (or presence). 

Patient is definitely overtalkative. Noticeable ‘‘push"’ behind speech but examiner is still largely able to direct 
the course of the interview. 

Speech is noticeably more ‘‘loose”’ than normal. Some evidence of circumstantiality. 

Patient is slightly overtalkative, may be mildly expansive. 

Normal flow of speech. Patient enters freely into discussion with the examiner. 

Speech is mildly inhibited but no real constriction is present. Speech is still largely spontaneous. 

Patient does not communicate freely; some blocking may be noted. May answer examiner's questions reluc- 
tantly or tangentially. 

Speech is definitely constricted but communication still takes place. Patient may be “‘holding back" relevant 
information. Little attempt to initiate or maintain conversation. 

Minimal (or no) communication. Speech highly constricted; no pertinent information is elicited. May be mute. 


Motor Activity Scale 


Patient is hyperactive, restless, and unable to sit still. In constant or near constant movement. May be in 
restraints. 

Definite signs of excitement are present. Patient is clearly overactive but still able to exert some control over 
his activity. 

Mild signs of agitation are present. Patient is more than normally active but in no respect exhibits a real loss 
of control. 

Patient may be slightly restless or fidgety, e.g., tapping his foot, drumming fingers, shifting uneasily in his 
chair, etc. 

Normal degree of motor activity. No evidence of retardation or agitation. 

Patient may appear to be slightly listless or lacking in energy but is not actually retarded nor apathetic. 

A noticeable degree of apathy or retardation is present. General motor activity is reduced but patient still 
exhibits some freedom of movement. 

Physical movements are definitely retarded. Movements are slow and effortful. 

Little or no physical movement is present. Patient is quiet or sits stiffly in chair. May be rigid. 


Quality of Thinking Scale 


Thought processes appear to be intact. No evidence of illogical or distorted thinking. 

Minor deviations from orderly thinking noted. No gross misinterpretation of reality is present. Some fluidity 
or looseness of thinking may be noted. 

Definite evidence of disturbed thinking. Logical inc i ies are app Del 1 ideas and/or halluci- 
nations may be present. There may be some doubt that his e experiences are based on reality. 

Reality testing is obviously impaired but patient still has some contact with reality. Remnants of logical 
thinking are still present but in general the patient ignores or is unaware of logical inc istencies. Delusi 
and/or hallucinations present and are accepted without question by the patient. 

Patient exhibits extremely confused and bizarre thinking. Little or no appreciation of reality is present. 
Thoughts are so fragmented, or communication is so scant as to make more exact estimation of the degree 
of control impossible. 


utilizing medians, was employed to calculate 
the significance of the differences. 


Results


Each patient was rated for motor activity,
verbal activity, and thinking in accordance
with the descriptive statements of the scales
in Table 1. The distribution of the patient
population on the various scale categories is
indicated in Table 2.

The patients were grouped on verbal and
motor activity in terms of overactivity (high,
rated 7, 8, or 9), underactivity (low, rated 1,
2, or 3), and normal activity (middle, rated
4, 5, or 6). These groups, differing in degree
of activity, were compared with one another
to ascertain the significance of the differences
in length of hospitalization. The extremely
active (group rated 9, and groups rated 9 and
8 combined) and the extremely underactive
(group rated 1, and groups rated 1 and 2
combined) were also compared with one another
and with the prototypic normally active
group (those rated 5). The t test was used to
test for the significance of the differences in
length of hospitalization and the results are
presented in Table 3.
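
The grouping rule for the two nine-point activity scales can be sketched in Python as follows. The band boundaries come from the text; the function names and the sample records are hypothetical and are not data from the study.

```python
from statistics import median


def band(rating: int) -> str:
    """Map a nine-point activity rating to the low / middle / high band."""
    if not 1 <= rating <= 9:
        raise ValueError("rating must be between 1 and 9")
    if rating <= 3:
        return "low"      # underactivity
    if rating <= 6:
        return "middle"   # normal activity
    return "high"         # overactivity


def median_stay_by_band(records):
    """records: iterable of (rating, days_hospitalized) pairs."""
    groups = {}
    for rating, days in records:
        groups.setdefault(band(rating), []).append(days)
    return {name: median(days) for name, days in groups.items()}


if __name__ == "__main__":
    # Hypothetical (rating, days) pairs for illustration only.
    sample = [(1, 30), (2, 50), (5, 40), (9, 20), (8, 60), (7, 10)]
    print(median_stay_by_band(sample))
```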

In reference to thinking, the same procedure
was followed but somewhat different
groups were combined because the scale is
one-tailed rather than two. In this case,
differences in length of hospitalization were
computed only for those groups rated 1 and 2 as
compared to those rated 3, 4, and 5 combined
and to those rated 5 alone. Because the N is
so small in the combined group rated 1 and
2, a comparison was made between 1, 2, and
3 as contrasted to 4 and 5. These results are
also presented in Table 3.


Table 2
Frequency Distribution of Patients on the Three Scales

[Table body not recoverable from the source scan. Rows: Motor activity, Speech activity, Thinking. Columns: Scale Value.]

The results summarized in Table 3 reveal 
that, in the areas of motor and verbal ac- 
tivity, there are no significant differences in 
length of hospitalization. This holds true when 
comparing normally active with both overac- 
tive and underactive and when comparing ex- 
tremes of activity with one another. An addi- 
tional check was made in the area of motor 
activity by selecting those patients who were 
placed in restraints and comparing them with 
an equal number of randomly selected pa- 
tients displaying a normal degree of motor 
activity. In this comparison, the two groups 
(N = 38) did not differ significantly in terms 
of median length of hospitalization (t = .19,
p = .85). These results indicate that neither
motor nor verbal activity is a valid prognostic
indicator of rapidity of improvement.

From Table 3 it may also be observed that 
significant differences are obtained when com- 
paring patients who reveal normal or rela- 
tively normal thinking with those who dis- 
play distorted or disorganized thinking. These 
differences are not in the anticipated direc- 
tion, however, and it appears that those pa- 
tients who have little or no disturbance of 
thought processes are more likely to be dis- 
charged sooner. Unfortunately, the number of 
cases displaying adequacy of thought processes
is small (N = 13) so that the generality
of conclusion is limited. When the group
rated three on thinking, i.e., somewhat patho- 
logical, was included along with those with 
more normal thinking processes and compared 
with patients displaying greater thought pa- 
thology, the results failed to reach acceptable 
levels of significance. However, even here, a
tendency for more rapid discharge of patients
whose thinking was more intact upon admission
was noted.


Table 3
Significance of Differences in Length of Hospitalization Between Groups Differing in Scale Ratings

[Table values not recoverable from the source scan. Columns: Scale Ratings of Comparison Groups; t for Verbal Activity, Motor Activity, and Quality of Thinking. Comparison groups listed: 123-789, 123-456, 456-789, 12-89, 12-5, 5-89, 1-9, 1-5, 5-9, 12-345, 123-45.]

As a final check on the data, the upper 
quarter of each group was compared with the 
lower quarter by means of the Mann-Whitney 
U test (4). In this comparison, only thinking 
proved significant (p = .02), whereas motor 
activity (p = .33) and verbal activity (p
= .23) proved nonsignificant as in the previous
tests. This nonparametric technique was
used not only because of nonnormal distribu- 
tion of the data, but also to give more appro- 
priate weight to extreme scores which tend to 
be underweighted with medians and over- 
weighted using means. 
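
For readers who wish to see how such a rank comparison works, the following is a minimal Python sketch of the Mann-Whitney U test. It uses midranks for ties and a normal approximation without continuity correction, which is an assumption; the source does not say how the authors obtained their p values. The lengths of stay at the bottom are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using midranks and a normal approximation."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2.0 + 1.0  # average rank of the tied block
        for k in range(i, j + 1):
            ranks[k] = midrank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    u = min(u_x, n1 * n2 - u_x)
    mean_u = n1 * n2 / 2.0
    # Variance of U with the usual correction for ties.
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


if __name__ == "__main__":
    # Hypothetical days of hospitalization for two extreme groups.
    more_intact = [21, 34, 28, 40, 25]
    less_intact = [45, 60, 38, 72, 55]
    print(mann_whitney_u(more_intact, less_intact))
```

With samples this small an exact table of U would normally be preferred to the normal approximation; the approximation is used here only to keep the sketch short.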


Discussion 

Overactivity and underactivity in verbal 
and motor behavior and a greater disorgani- 
zation of thought processes do not appear to 
be associated with more rapid improvement 
as measured by length of hospitalization. 
In the area of thinking, as a matter of fact, 
the less the disorganization of thought proc- 
esses at the time of admission the greater 
is the tendency for more rapid discharge. 
These findings shed doubt upon the generali- 
zation that more extreme symptomatology re- 
sponds more rapidly to treatment. 

In a previous article (3), the authors re- 
lated the impression that severe symptoma- 
tology is more refractory than milder overt 
symptomatology to Gestalt memory principles 
of similarity and contrast, i.e., that we tend to 
recall those patients whose final state differs 
markedly from their condition upon admission. 
The present study suggests that improvement, 
sufficient to warrant discharge, is nondifferen- 
tial between patients displaying extreme or 
mild symptomatology in temporal terms. It is 
important to point out clearly, however, that 
no attempts were made to examine the qualita- 
tive degree of improvement. Since all patients 
were discharged as improved, we might hy- 
pothesize that attempts to measure qualita- 
tive changes between condition upon admis- 
sion and at discharge would yield greater 
change in those patients who initially dis- 
played more deviant behavior. We do not 
know, however, whether the discharge condition
of patients differing upon admission in
terms of severity of symptomatology would 
yield differences in terms of some absolute 
criteria of “normality.” It appears that the 
evaluation of the degree and nature of quali- 
tative change, and the comparison of dis- 
charge condition to some absolute criteria of 
“normality” is a fruitful area for further re- 
search to shed more light on this problem. 
Replication of the present study and further 
studies investigating other symptom pictures 
are also necessary. 


Summary 


Two hundred psychotic patients discharged 
as improved were rated on verbal activity, 
motor activity, and quality of thinking dis- 
played at the time of hospital admission. 
Comparisons were made to determine whether 
differences in degree of overt symptomatology 
were associated with rapidity of improvement 
as measured by length of hospitalization. The 
results indicate that degree of activity is not 
associated with rapidity of improvement but 
that patients with more intact thought proc- 
esses at the time of admission tend to be dis- 
charged more rapidly. The findings shed doubt 
upon the generalization that more extreme 
symptomatology responds more rapidly to 
treatment. Differences in viewing improvement
in temporal and qualitative terms were
discussed in reference to suggestions for fur- 
ther research. 


Self Acceptance and Marital Happiness¹


Daniel Eastman 
Teachers College, Columbia University 


Previous researches on the personality cor- 
relates of marital happiness or adjustment 
have tended to rely on heterogenous and ec- 
lectic personality measures of limited theo- 
retical and clinical significance (2, 3, 6, 7, 
12). In the present study, self acceptance 
was chosen as a personality measure for in- 
vestigation because it is a relatively homog- 
enous variable based on a developed theory 
of personality with explicit clinical implica- 
tions (8, 9, 10, 11, 15). 

Reduced to its simplest terms, the experi- 
ment consisted in scoring a sample of mar- 
ried couples on two questionnaires, one for 
marital happiness (MH) and one for self ac- 
ceptance, and correlating the two sets of 
scores. This simple scheme was complicated 
by further investigating: (a) the relation of 
MH to the additional personality variables, 
acceptance of others and psychological status; 
(b) the relation of MH to all the personality
variables in the subjects’ mates as well as in 
the subjects; (c) the relation of MH to the 
mates’ MH, that is, the role of mutual hap- 
piness; and finally, (d) the effect of sex dif- 
ferences on the correlations between MH and 
the personality variables. 


Method 


Marital happiness was measured with the 
instrument constructed by Wallace (14), in 
essence a compilation of the most discriminating
items in the Terman (12) and Burgess
and Cottrell (2) questionnaires.


1 This paper is based upon a dissertation submitted
to the Department of Family Life, Teachers
College, Columbia University, in partial fulfilment of
the requirements for the degree of doctor of philosophy;
committee members, Ernest G. Osborne, chairman;
Arthur T. Jersild and Laurance F. Shaffer.
The original report (4) may be consulted for more
detailed treatment of the rationale, methods, and
results.


Self acceptance (s) was measured as the 
discrepancy between Ss’ ratings of self and 
self ideal, using an abbreviated form of the 
rating scales developed by Bills, Vance, and 
McLean (1). Acceptance of men (m) and of 
women (w) was measured on the same scales 
as the discrepancy between ratings of men and 
women in general, and self ideal; acceptance 
of others (o) being then taken as the mean of
acceptance of men plus acceptance of women
(m/2 + w/2). The variable called psychological
status was calculated as the signed difference
between acceptance of men, women, or
others, and self acceptance (m-s, w-s, or o-s).
Psychological status corresponded to what are 
sometimes called feelings of superiority or in- 
feriority. 
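
The derived scores defined above reduce to simple arithmetic on the three measured discrepancies. The Python sketch below is illustrative only. The class and method names are hypothetical, and the worked values are the husbands' means from Table 1, used on the assumption that a mean of differences equals the difference of means.

```python
from dataclasses import dataclass


@dataclass
class AcceptanceScores:
    s: float  # self acceptance: self vs. self-ideal discrepancy
    m: float  # acceptance of men: men-in-general vs. self-ideal discrepancy
    w: float  # acceptance of women: women-in-general vs. self-ideal discrepancy

    @property
    def o(self) -> float:
        """Acceptance of others, taken as m/2 + w/2."""
        return self.m / 2 + self.w / 2

    def status(self) -> dict:
        """Psychological status: signed differences m-s, w-s, and o-s."""
        return {
            "men": self.m - self.s,
            "women": self.w - self.s,
            "others": self.o - self.s,
        }


if __name__ == "__main__":
    husbands = AcceptanceScores(s=36.2, m=46.8, w=51.7)
    print({k: round(v, 2) for k, v in husbands.status().items()})
```

The resulting status values for men and women (about 10.6 and 15.5) agree with the husbands' status means printed in Table 1.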

Ss were gathered in groups of five or more 
couples brought together by colleagues ostensibly
for social purposes. All couples in every
group agreed to fill out the questionnaires. In
order to encourage frankness, the question- 
naires were anonymous, and couples were 
separated before filling them out. Each couple 
was required to put its completed question- 
naires in the same manila envelope before re- 
turning them, so that the investigator was 
able to determine married pairs, while pre- 
serving anonymity and interspouse secrecy. 
When some eight spoiled blanks were eliminated,
the sample consisted of 50 couples
married two years or more, and 14 couples 
married less than two years. 


Results 


Table 1 shows the means and standard de- 
viations of all the variables for the sample of 
50 couples married more than two years. The
split-half reliabilities corrected by the Spearman-Brown
formula were: for self acceptance, .89;
for acceptance of men, .82; for
acceptance of women, .90. Wallace reported
a corrected reliability of .90 for his test of
marital happiness.


Table 1

Marital Happiness and All Variables, by Sexes, for
50 Couples Married More Than Two Years
(Means and standard deviations)

                      Husbands (N = 50)    Wives (N = 50)
Variable              Mean      SD         Mean      SD
Marital happiness     108.4     25.49      112.6     27.15
Self acceptance        36.2     15.97       36.4     14.66
Acceptance of men      46.8     10.97       45.0     14.90
Accept. of women       51.7     10.00       51.1     16.50
Status re men          10.6     15.90        8.6     17.60
Status re women        15.5     14.53       14.7     17.69


The means for all the variables reported 
here were not significantly different from 
those obtained in other investigations with 
the same questionnaires in clinically “normal” 
groups (4). 

The chief point to be noted in Table 1 is 
the complete lack of any significant difference 
in means between the husbands and wives in 
the sample on any of the variables measured. 
This lack of mean difference between sex 
groups was confirmed in another sample not 
reported here (4). Under these circumstances, 
any observed difference in the marital per- 
formance of the two sex groups could not be 
explained on the basis of different average en- 
dowments of the two sexes. 

It may be noted that both husbands and 
wives scored women in general significantly 
less acceptable than men in general, a bias 
also confirmed for both sex groups in another 
sample not reported here (4). Statistical tests 
in the further course of the investigation 
showed this bias to have no significant cor- 
relation with marital happiness in either hus- 
bands or wives or their mates. 


Marital Happiness and Self Acceptance 


Table 2 shows all the zero-order correla- 
tions between MH and self acceptance. MH 


was significantly related to self acceptance 
both in the Ss themselves and in their mates. 
If sex groupings were combined, the correla- 
tion between MH and the Ss’ own self ac- 
ceptance was significant at the .0005 level 
and between MH and mates’ self acceptance 
at the .0001 level. 

The correlation between the husbands’ MH 
and their wives’ self acceptance was the high- 
est obtained for self acceptance in a single 
mate. This higher correlation was probably 
not the result of accident. In the sample of 
14 couples married less than two years, the 
correlation between husbands’ MH and wives’ 
self acceptance was .53, the only significant 
correlation with self acceptance obtained in 
that small sample. 

The correlation of MH with the summed 
self acceptance of both mates was consider- 
ably higher than the correlation with either 
mate’s self acceptance alone, in part because
the mates’ self acceptance scores were largely 
independent of each other (r = .24). About 
twenty-five per cent of the variance in MH 
was accounted for by the summed self accept- 
ance of both mates; a figure that compares 
favorably with the percentage of MH ac- 
counted for by any other single variable, or 
any other combination of variables, previously 
investigated. It should be pointed out, how- 
ever, that the variables employed in previous 
investigations were not summed for both 
mates, since the investigators either did not, 
or could not, obtain a significant correlation 
between MH and mates’ scores on the vari- 
ables employed. 
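
The gain from summing the two mates' scores can be checked with the standard formula for the correlation of a criterion with a sum of two predictors. The Python sketch below assumes equal standard deviations for the two self acceptance scores unless others are supplied. Its inputs are the zero-order correlations of Table 2 as they can be read from the scan, together with the reported r = .24 between mates. The function name is illustrative.

```python
import math


def corr_with_sum(r_y1: float, r_y2: float, r_12: float,
                  sd1: float = 1.0, sd2: float = 1.0) -> float:
    """Correlation of y with (x1 + x2), given r(y,x1), r(y,x2), r(x1,x2)."""
    numerator = r_y1 * sd1 + r_y2 * sd2
    denominator = math.sqrt(sd1 ** 2 + sd2 ** 2 + 2 * r_12 * sd1 * sd2)
    return numerator / denominator


if __name__ == "__main__":
    # Husbands' MH: own self acceptance .37, wives' self acceptance .47.
    r_husbands = corr_with_sum(0.37, 0.47, 0.24)
    # Wives' MH: own self acceptance .35, husbands' self acceptance .35.
    r_wives = corr_with_sum(0.35, 0.35, 0.24)
    for label, r in (("husbands", r_husbands), ("wives", r_wives)):
        print(label, round(r, 2), "variance accounted for:", round(r * r, 2))
```

Under these assumptions the formula gives about .53 for husbands and .44 for wives, or roughly 28 and 20 per cent of the variance, which is in line with the "about twenty-five per cent" stated in the text.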


Table 2

Correlation of Marital Happiness with Subjects’ and
Mates’ Self Acceptance, by Sexes, for 50 Couples
Married More Than Two Years

                                 Husbands (N = 50)    Wives (N = 50)
Marital Happiness
Correlated with—                 r       p            r       p
Self acceptance                  .37     .005         .35
Mates’ self acceptance           .47     .0005        .35     .01
Self acceptance of both mates    .53     .0005        .44     .005

Table 3

Correlation of Marital Happiness with Other Variables,
by Sexes, for 50 Couples Married More
Than Two Years

[Table body not recoverable from the source scan. Rows: Acceptance of Men, Women, Others; Status re Men, Women, Others. Columns: r and p for Husbands (N = 50) and for Wives (N = 50).]

Note.—The score for acceptance of others was the mean of
the scores for acceptance of men plus acceptance of women.


Marital Happiness and the Other Variables 


Table 3 shows the correlations between MH 
and the Ss’ own scores for acceptance of others 
and for psychological status; Table 4 shows 
the correlations between MH and the mates’ 
scores for acceptance of others and psycho- 
logical status. 

It will be seen that the MH of both hus- 
bands and wives correlated significantly with 
the wives’ acceptance of others, but not with 
the husbands’ acceptance of others; and that 
the MH of both husbands and wives corre- 
lated significantly with the husbands’ psy- 
chological status but not with the wives’ psy- 
chological status. Acceptance of others was 
maritally operative only in the wives, psy- 
chological status only in the husbands. 

In view of the double check afforded by 
separate scores for acceptance of men (m) 
and acceptance of women (w), and of the fur- 
ther double check afforded by cross validation 
between Ss and mates, it seems fairly safe to 
suppose that the sex pattern revealed in these 
correlations exists in the population of couples 
married more than two years. 


Marital Happiness and Mutual Happiness 


The correlation between the MH scores of 
the husbands and their wives was .74 for the 
50 couples in the present sample. This figure 


appears to indicate a considerable dependence 
of MH on mates’ marital happiness, that is, 
a high degree of mutuality in MH. 

However, the figure .74 was unreliable be- 
cause the MH test itself is partly composed 
of items that deal with objective facts, such 
as the amount of disagreement between the 
mates. The correlation may therefore have 
represented in part nothing more than the 
tendency of the mates to see objective facts 
about their marriage in the same way. Never- 
theless the positive correlation between the 
MH scores of the mates obtained in this 
study, and in every other study using similar 
instruments, probably does indicate some real 
interdependence of MH, that is, real mutual- 
ity, as Terman supposes (12, pp. 80-83). 

In Table 5, partial correlations are shown 
between MH and self acceptance where mates’ 
MH was held constant, that is, where all the 
mutuality, whatever its unknown size, was 
eliminated statistically. It may be seen in the 
first row of the table that under these condi- 
tions MH showed no significant correlation 
with Ss’ own self acceptance in either sex 
group, in contrast to the highly significant cor- 
relations obtained earlier when mutuality was 
not eliminated. 
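
Holding a third variable constant in this way corresponds to the first-order partial correlation. The Python sketch below applies the standard formula to zero-order values taken from Table 2 and from the interspouse MH correlation of .74. Pairing those values in this way is an assumption, and the results are illustrative only; they have not been checked against the entries of Table 5.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


if __name__ == "__main__":
    # x = husbands' MH, y = husbands' own self acceptance, z = wives' MH.
    # r_yz is taken as the wives' MH vs. mates' self acceptance value (.35).
    own = partial_corr(r_xy=0.37, r_xz=0.74, r_yz=0.35)
    # x = husbands' MH, y = wives' self acceptance, z = wives' MH.
    mate = partial_corr(r_xy=0.47, r_xz=0.74, r_yz=0.35)
    print(round(own, 2), round(mate, 2))
```

With these inputs the husbands' own self acceptance drops to a small partial correlation while the correlation with their wives' self acceptance stays appreciably larger, which is the pattern the text describes.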

Apparently, Ss’ own self acceptance is related
only to that portion of MH which is
mutual; and is probably highly related to it,
if it could be separated from MH in general.


Table 4

Correlation of Marital Happiness with Mates’ Scores
on Other Variables, by Sexes, for 50 Couples
Married More Than Two Years

[Table body not reliably recoverable from the source scan. Rows: Mates’ accept. of Men, Women, Others; Mates’ status re Men, Women, Others. Columns: r and p for Husbands (N = 50) and for Wives (N = 50).]

Note.—The score for acceptance of others was the mean of
the scores for acceptance of men plus acceptance of women.


Table 5

Partial Correlation of Marital Happiness with
Self Acceptance, by Sexes, Holding Mates’
Happiness Constant, for 50 Couples
Married More Than Two Years

[Table body not recoverable from the source scan. Rows: Self acceptance; Mates’ self acceptance; Mates’ self acceptance*. Columns: r and p for Husbands (N = 50) and for Wives (N = 50).]

* Indicates second-order partial correlation holding Ss’ self
acceptance constant as well as mates’ marital happiness.

In the second row of Table 5 it will be seen 
that the husbands’ MH continued to have a 
significant correlation with their wives’ self 
acceptance, regardless of the wives’ own hap- 
piness in the marriage; and in the third row 
the same relation held even when the hus- 
band’s own self acceptance was also held 
constant. Some unknown factor in the wives, 
not present in the husbands, caused the wives 
to make their husbands happier, regardless of 
their own happiness. 


Marital Happiness and Sex Differences 


The data bearing on the relation of MH to 
sex differences have now all been presented, 
and may be summarized as follows: (a) the 
husbands’ MH showed a markedly higher 
zero-order correlation with their wives’ self 
acceptance; (b) this correlation remained
uniquely significant even when the wives’ own 
happiness was held constant by partial corre- 
lation; (c) both mates’ MH correlated with 
acceptance of others in the wives only; (d) 
both mates’ MH correlated with psychologi- 
cal status in the husbands only; and (e) the 
MH of both husbands and wives was unaf- 
fected by the tendency of both sexes to rate 
women in general less acceptable than men. 

In short, the wives were more influential in 
marriage, more persistent, less concerned with 
interpersonal status, and indifferent to sexual 
inferiority. There is considerable probability 


that this situation obtains among the popula- 
tion of wives married more than two years. 


The Newlywed Sample 


The data for the sample of 14 couples mar- 
ried less than two years are not presented in 
detail. It was assumed that the relatively un- 
settled state of the criterion, marital happi- 
ness, would tend to depress the correlations, 
and this proved to be the case in general. For 
instance, the interspouse correlation for MH, 
which was .74 in the older marriages, was 
only a nonsignificant 359 in the newer mar- 
riages. There was evidence of a somewhat dif- 
ferent pattern of correlations in the new mar- 
riages; for instance, the husbands appeared 
to be more dependent on their wives’ person- 
ality traits than the wives on their husbands’ 
personality traits. But the sample was too 
small, and too few of the correlations signifi- 
cant, to merit further description of the re- 
sults. 


Discussion 


The question naturally arises whether the 
correlation between marital happiness and 
self acceptance obtained in this study indi- 
cates that self acceptance causes MH, or MH 
causes self acceptance. This is a moot ques- 
tion. Most theoreticians of the self believe 
self acceptance to be the outcome, in large 
measure, of childhood experiences. If this is 
the case, then the results in the present re- 
search would seem highly compatible with 
Terman’s finding, supported by every subsequent
investigation, that childhood background
is one of the most discriminating factors
for marital happiness (12, p. 372). Self
acceptance may be the variable intervening 
in time between childhood and marriage. 

In order to make a conclusive test of the 
causal relation, however, it would be neces- 
sary to obtain self acceptance scores before 
the marriages, which was not done. In support 
of the causal effect of personality in general 
the studies of Kelly (4), Burgess and Wallin 
(3, p. 536), and Terman and Oden (13, pp. 
257-260) may be cited, where in each in- 
stance a personality measure before marriage, 
or divorce, correlated with MH after mar- 
riage, or with divorce as a criterion. 




The finding in this study that wives con- 
tinue to influence their husbands’ MH re- 
gardless of their own MH is compatible with 
the conclusions of Burgess and Cottrell (2, 
pp. 341-349), who cite it as one of their six 
main conclusions that American wives make 
the major adjustment in marriage. Making 
adjustments would seem to imply greater 
concern with and influence on the mate, since 
it is the mate that is adjusted to, in part. 

The failure of MH to correlate with the 
positive social trait, acceptance of others, in
the husbands was unexpected, but probably 
accounted for by the fact that MH did cor- 
relate with the husbands’ psychological status. 
Psychological status included acceptance of 
others, from which it was derived mathemati- 
cally. Apparently, some third factor in hus- 
bands, perhaps competitiveness, makes status 
more important to them than to their wives. 


Summary 


With a sample of 50 couples married more 
than two years it was shown that: 

Marital happiness is related to self accept- 
ance, acceptance of others, and psychological 
status in both subjects and their mates; to 
self acceptance in both sexes, to acceptance 
of others probably only in wives, and to psy- 
chological status probably only in husbands. 

The relation of marital happiness to self 
acceptance, acceptance of others, and psycho- 
logical status is affected in several other meas- 
urable ways by average psychological differ- 
ences between the two sexes. 


Received April 24, 1957. 


The Social Desirability Factor in Edwards’ PPS¹


Daniel Kelleher 


University of Washington 


Edwards has published a forced choice self- 
assessment inventory, Personality Preference 
Schedule (PPS), designed to give scores on 
personality variables that are free from the 
effects of social desirability (1). The inven- 
tory consists of 225 item pairs, each pair 
made up of two items rated by college stu- 
dent judges to have similar social desirability 
values. It has been shown that correlations 
between scores on the 15 variables and social 
desirability are low (1). The present study 
was designed to see whether social desirabil- 
ity still operated on choice of item within 
item-pairs, however. 

One hundred and one males and 101 fe- 
males from beginning psychology courses were 
given the PPS and the Social Desirability 
(SD) scale. The latter scale, described by 
Edwards (1), yields scores measuring the ex- 
tent to which the subjects endorse statements 
in a socially desirable way. Point-biserial cor- 
relations were computed separately for each 
sex between SD scale score and choice of A 
or B for each of the 210 different item pairs. 
These correlations were found to cluster
around zero, with only slightly greater than
chance occurrence of significant correlations.


1 An extended report of this study may be obtained
without charge from Daniel Kelleher, 1201
Campus Parkway, Seattle, Wash., or for a fee from
the American Documentation Institute. Order Document
No. 5469, remitting $1.75 for microfilm or $2.50
for photocopies.

There are 48 item-pairs in which the dif- 
ference between judged social desirability 
rating of the items in the pair is relatively 
great. Correlations were computed so that if 
social desirability was operating in item 
choice, this group of 48 pairs would show 
preponderantly plus correlations. No such ef- 
fect was found. 
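
The item-level statistic used above, the point-biserial correlation between a dichotomous choice and a continuous scale score, can be sketched in Python as follows. The coding of choice A as 1 and choice B as 0, the function name, and the sample data are all assumptions made for illustration.

```python
import math


def point_biserial(choices, scores):
    """Point-biserial r between 0/1 choices and continuous scores."""
    n = len(scores)
    if n != len(choices):
        raise ValueError("choices and scores must have the same length")
    group1 = [s for c, s in zip(choices, scores) if c == 1]
    group0 = [s for c, s in zip(choices, scores) if c == 0]
    if not group1 or not group0:
        raise ValueError("both choices must occur at least once")
    mean_all = sum(scores) / n
    # Population standard deviation of all scores (divisor n).
    sd_all = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    if sd_all == 0:
        raise ValueError("scores have zero variance")
    p = len(group1) / n
    mean_diff = sum(group1) / len(group1) - sum(group0) / len(group0)
    return mean_diff / sd_all * math.sqrt(p * (1 - p))


if __name__ == "__main__":
    # Hypothetical data: choice of A (1) or B (0) and SD scale score.
    chose_a = [1, 1, 0, 0, 1, 0]
    sd_scale = [30, 28, 25, 27, 31, 24]
    print(round(point_biserial(chose_a, sd_scale), 3))
```

Computed this way the coefficient is identical to the ordinary product-moment correlation with the choice coded 0 or 1.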

A special score, HSD score, was computed 
by counting the number of times the subject 
chose the item in a pair with the higher so- 
cial desirability rating, no matter how small 
the difference (all 210 pairs used). The mean 
HSD score for both males and females was 
higher than that expected by chance at the 
.01 level of confidence. Correlations between
SD scale scores and HSD scores were
nonsignificant for each of the sexes.
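
The HSD score and its chance expectation can be illustrated with a short Python sketch. With 210 pairs and an even chance of choosing either item, the expected score is 105. The one-sample t statistic shown is an assumption, since the report does not name the test used, and the data are hypothetical.

```python
import math


def hsd_score(chose_a, a_is_higher):
    """Count pairs in which the subject chose the item rated more desirable."""
    return sum(1 for chose, higher in zip(chose_a, a_is_higher) if chose == higher)


def chance_mean(n_pairs: int) -> float:
    """Expected HSD score if each choice were a coin flip."""
    return n_pairs / 2


def one_sample_t(scores, mu):
    """t statistic for the mean of scores against a fixed value mu."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))
    return (mean - mu) / (sd / math.sqrt(n))


if __name__ == "__main__":
    print(chance_mean(210))
    # Hypothetical HSD scores for a handful of subjects.
    group_scores = [109, 112, 104, 115, 108, 111]
    print(round(one_sample_t(group_scores, chance_mean(210)), 2))
```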

On the basis of the above results, it was 
felt social desirability played an insignificant 
role in item responses on the PPS. 

Brief Report. 
Received November 13, 1957. 


Six Measures of Self-Concept Discrepancy and
Instability: Their Interrelations, Reliability, and
Relations to Other Personality Measures¹


Gene Marshall Smith


During recent years, largely as a result of
the emphasis placed on the self concept by 
Rogers (19, 20), many experimental exami- 
nations of self-regarding attitudes have been 
reported (e.g., 1, 2, 3, 7, 18). One measure 
which has frequently been studied is the dis- 
crepancy between the “Self” and the “Ideal 
Self,” i.e., the discrepancy between what the 
individual says he is and what he says he 
would like to be. The present paper deals with 
this measure and five new measures of self 
concept. Three of the five new measures deal 
with the instability, over time, of the concepts
of “Self,” “Ideal Self,” and “Social
Self.” In this report, the concept of “Social
Self” refers to the individual’s notion of how
he is seen by other people. The remaining two
measures deal with the discrepancy, at a given
point in time, between “Ideal Self” and “Social
Self” and the discrepancy between “Self”
and “Social Self.” 

The present investigation was not designed 
to test specific hypotheses involving the six 
self-concept measures mentioned above. 
Rather, it was designed to determine the 
interrelations among these six measures, to 
evaluate their reliability, and to explore the 
relations between them and certain other 
measures of personality. The six self-concept 
measures are given operational meaning in 
the method section and are given psychological 
meaning in the discussion section. 


1 This report is based on work carried out during 
1955 and 1956, supported by grants to Henry K. 
Beecher, from the Medical Research and Development 
Board of the Department of the Army and 
the United States Public Health Service. A pilot 
study investigating the six self-concept variables, 
carried out in 1953 at the University of Rochester, 
was supported by a grant to G. R. Wendt, from the 
Office of Naval Research. The author wishes to express 
his appreciation to R. L. Burke and J. M. 
Hoffman for their assistance with the collection and 
statistical examination of the data, and to those 
colleagues whose critical reading greatly improved the 
manuscript: E. L. Cowen, R. B. Cattell, W. Finegar, 
D. Briggs, and A. Couch. 


Method 
Self-Concept Test 


Twenty-four college men were given a self- 
concept test consisting of 29 personality de- 
scriptive phrases. An attempt was made to 
cover a broad range of needs, attitudes, and 
behavior tendencies with the 29 phrases. Some 
phrases were relevant to certain of Cattell’s 
16 personality factors (5); some were rele- 
vant to certain of Edwards’ variables (9); 
some were relevant to certain of the needs 
listed by Murray (17). Others were relevant 
to no particular list of personality characteristics. 
Examples of the 29 traits are: (a) 
tends to express anger easily and is apt to be 
found in arguments; (b) tends to be energetic, 
alert, spirited, and enterprising; has 
great vitality and drive; (c) tends to think 
of people as being somewhat selfish, unfair, 
and self-seeking; (d) tends to be cooperative, 
agreeable, and ready to meet people at least 
halfway.² 


2 To save printing costs, the complete list of 29 
traits is not presented here. It has been deposited 
with the American Documentation Institute. Order 
Document No. 5435, remitting $1.25 for microfilm or 
$1.25 for photocopies. 


Administration. Each phrase was typed on 
a 3 × 5 card. The experimenter gave the S 
the 29 cards, one at a time, and asked him to 
sort them into five piles on the basis of the 
degree to which they described his “Ideal 
Self.” No restrictions were imposed concern- 
ing the number of cards which could be placed 
in a given pile. After this first sorting, the S 
re-sorted these same 29 cards into five piles 
on the basis of the degree to which they 
described his “Self.” Finally, the S re-sorted 
them on the basis of the degree to which they 
described his “Social Self.” “Ideal Self” was 
defined for the S as the self he would like to 
be—not what he thinks he ought to be but 
what he would like to be. “Self” was defined 
in terms of what the S thinks he actually is. 
“Social Self” was defined in terms of the S’s 
notion of what other people think of him. 
The meaning of other people was specified 
for the S by instructing him as follows: “By 
other people we . . . are referring to people 
who know you well enough to hold opinions 
about you, but people who are not your five 
or six closest friends.” 

The five piles into which the S sorted the 
29 traits were labeled in the same manner for 
all three trait sortings. The labels used were 
“Almost All of the Time,” “Usually,” “About 
Half of the Time,” “Seldom,” and “Almost 
Never.” When a trait was put in Pile 1, it 
was given a score of one. When it was put in 
Pile 2, it was given a score of two, etc. Thus, 
if a given S placed a given card in Pile 1 
when sorting on the basis of “Ideal Self,” 
Pile 3 when sorting on the basis of “Self,” 
and Pile 2 when sorting on the basis of “So- 
cial Self,” it was interpreted as meaning he 
would like that trait to be characteristic of 
him almost all of the time; he thinks that 
trait is characteristic of him about half of the 
time; and he thinks other people would say 
that trait is usually characteristic of him. 

Testing occasions. The S took the com- 
plete self-concept test on three separate occa- 
sions, once when medicated with morphine, 
once with amphetamine, and once with 
placebo. The three medications were adminis- 
tered subcutaneously in doses that varied ac- 
cording to body weight. The Ss received, per 
70 kg. of body weight, 10 mg. of morphine, 
15 mg. of amphetamine, and 1 ml. of physio- 
logic saline solution as a placebo. The three 
testing occasions were separated by intervals 
of approximately one week. Order of drug 
administration was balanced. 

The three discrepancy measures (discrep- 
ancy between “Self” and “Ideal Self,” dis- 
crepancy between “Social Self” and “Ideal 
Self,” and discrepancy between “Self” and 
“Social Self”) were based on the ratings ob- 
tained under the placebo condition only. Thus 
these measures were in no way influenced by 
the pharmacologically active ingredients in 
morphine and amphetamine. The three insta- 
bility measures (instability of “Self,” “Ideal 
Self,” and “Social Self”) were based on the 
ratings obtained under all three conditions of 
medication. However, since the self-concept 
test was only one of a series of tests adminis- 
tered to evaluate drug effect, it was given ap- 
proximately 44 hours after medication; and 
at this time, the remaining drug effects were 
weak (12, 21). Thus, at present we believe 
the point to be emphasized concerning the in- 
stability variables is not that they reflect in- 
stability produced by drugs but, rather, that 
they reflect instability occurring over time. 
This belief is now being checked in a separate 
sample of Ss tested on three separate non- 
medicated occasions. 

Three discrepancy measures. The following 
method was used to determine each S’s dis- 
crepancy between “Self” and “Ideal Self.” 
The unsigned difference between the rating 
an S gave a trait on the “Self” dimension and 
the rating he gave that trait on the “Ideal 
Self” dimension was taken as the measure of 
his discrepancy between “Self” and “Ideal 
Self” with respect to that particular trait. The 
absolute difference between an S’s “Self” rat- 
ing and his “Ideal Self” rating was obtained 
for each of the 29 traits, and the mean of his 
29 absolute differences was taken as the meas- 
ure of his discrepancy between “Self” and 
“Ideal Self” for the test as a whole. The discrepancy 
between “Self” and “Social Self” 
and the discrepancy between “Ideal Self” and 
“Social Self” were obtained in like manner. 
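
The mean-of-discrepancies computation described above can be sketched as follows. This is only an illustration: the function name and the four-trait ratings are hypothetical (the actual test used 29 traits), and the ratings are assumed to be the pile scores 1 through 5.

```python
from statistics import mean


def discrepancy_score(ratings_a, ratings_b):
    """Mean absolute difference between two sortings of the same traits.

    Each argument is a list of pile scores (1-5), one per trait, in the
    same trait order.
    """
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both sortings must rate the same traits")
    return mean(abs(a - b) for a, b in zip(ratings_a, ratings_b))


# Hypothetical placebo-day ratings for four traits (illustration only).
self_ratings = [3, 2, 4, 1]
ideal_ratings = [1, 2, 5, 1]
social_ratings = [2, 2, 4, 3]

self_ideal = discrepancy_score(self_ratings, ideal_ratings)
self_social = discrepancy_score(self_ratings, social_ratings)
ideal_social = discrepancy_score(ideal_ratings, social_ratings)
print(self_ideal, self_social, ideal_social)
```

Because the differences are unsigned, a trait rated above the ideal and a trait rated below it contribute equally to the score.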

The mean of discrepancies method was em- 
ployed rather than the more frequently used 
correlation method because it was felt that 
the former is less influenced by such factors 
as number of traits comprising the trait-pool, 
response set bias, etc., than is the latter. 

Three instability measures. Each S’s “instability 
of Self” score was obtained as follows. 
The variance of the three ratings a 
particular S gave a particular trait on the 
“Self” dimension on the three test occasions 
was taken as that S’s degree of “instability of 
Self” for that particular trait. The variance 
of each S’s three ratings was obtained for each 
of the 29 traits. The mean of an S’s 29 vari- 
ances was taken as the measure of his “in- 
stability of Self” for the test as a whole. 
Measures of “instability of Social Self” and 
“instability of Ideal Self” were obtained in a 
like manner. 
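
A minimal sketch of the mean-of-variances score follows. The text does not say whether the variance of the three ratings was computed with a divisor of n or n − 1; the sketch assumes n (population variance), and the function name and sample ratings are hypothetical.

```python
from statistics import mean, pvariance


def instability_score(occasions):
    """Mean, over traits, of the variance of each trait's ratings.

    `occasions` is a list of rating lists, one list per testing occasion,
    all in the same trait order. Population variance (divisor n) is an
    assumption; the source does not state which divisor was used.
    """
    per_trait = zip(*occasions)  # regroup the ratings trait by trait
    return mean(pvariance(ratings) for ratings in per_trait)


# Hypothetical "Self" ratings for three traits on three occasions.
occasions = [
    [1, 3, 2],  # first occasion
    [1, 4, 2],  # second occasion
    [1, 5, 5],  # third occasion
]
score = instability_score(occasions)
print(round(score, 3))
```

Switching to the n − 1 divisor would rescale every S's score by the same factor (3/2 for three occasions), so correlations with other variables would be unaffected.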


Non-Self-Concept Personality Measures 


In addition to the six self-concept measures, 
49 “non-self-concept” personality measures 
were obtained for each S. Thirty-six of these 
were obtained from the following person- 
ality inventories: The Cattell 16 Personality 
Factors Questionnaire (4); the Maslow Se- 
curity-Insecurity Inventory (16); the Man- 
son Evaluation (15); the Edwards Personal 
Preference Schedule (8); the Minnesota 
Thinking, Social and Emotional Introversion- 
Extraversion Test (10). (The Edwards Test 
was taken under placebo; the rest were taken 
during the five nonmedication testing sessions 
mentioned in the following paragraph.) Of 
the 13 “non-self-concept” measures which 
were not obtained from personality inven- 
tories, 12 were measures of average-mood and 
one was a measure of self-insight. 

Average-mood measures. Information con- 
cerning average-mood was obtained as fol- 
lows: A total of 15 mood samples was ob- 
tained for a given S by having him rate him- 
self on a seven-point scale on each of 12 
bipolar mood items. These 15 samples of 
mood were obtained on the placebo day and 
on five nonmedication days, two of which 
preceded and three of which followed the 
three days on which medications were given. 
The mood items were presented to the Ss in 
the form of bipolar seven-point scales, where 
each of the two poles was defined by a brief 
sentence. For example, the extremes of one 
mood item were specified in terms of the fol- 
lowing two sentences: “I feel a little grouchy 
and probably would get mad easily.” “I feel 
very good natured and probably would not get 
mad easily.” The following is a listing of the 
remaining 11 items (abbreviated): sleepy vs. 
awake; uneasy vs. secure; mind alert vs. mind 
slow; dreamy vs. not dreamy; happy vs. sad; 
clear-headed vs. groggy; confident vs. shy; 
peppy vs. no pep; worried vs. not worried; 
calm vs. restless; unsociable vs. sociable. 

The mean of a given S’s 15 ratings for a 
given mood item was taken as the measure of 
his average-mood with respect to that par- 
ticular item. The 12 items comprising the 
mood questionnaire are based on a set of 
mood items developed by Lasagna, von Felsinger, 
and Beecher (13). 

Self-insight measure. The self-insight meas- 
ure was obtained as follows: Two psycholo- 
gists, who had interviewed and tested each S 
for about 20 hours, rated each S on a five- 
point scale on each of the 29 personality traits 
comprising the self-concept test. After rating 
a given S on a given trait, the psychologist 
evaluated his confidence in that rating, mark- 
ing it high, medium, or low. The mean of the 
two psychologists’ ratings of an S on a given 
trait was then compared with the S’s own 
rating on that trait on the “Self” dimension 
under placebo. The amount of agreement be- 
tween the rating the S gave himself on that 
trait and the mean rating given to him by 
the two psychologists was taken as a measure 
of his self-insight with respect to that par- 
ticular trait. The discrepancy between a given 
S’s “Self” rating and the mean rating given 
to him by the two psychologists was obtained 
for each of the 29 traits. However, not all of 
the 29 discrepancies contributed to a given 
S’s self-insight score. One of the following two 
criteria had to be met before a trait was used: 
(a) a trait was used if both psychologists had 
rated the S exactly the same way on that 
trait and each psychologist had recorded high 
or medium confidence in his rating; (b) a 
trait was used if the two psychologists had 
disagreed by only one scale point and both 
had recorded high confidence in their rating 
of that S on that trait. The average number 
of traits meeting one or both of the above 
criteria was 18.5 per S. 
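
The trait-selection rule can be sketched as below. The two criteria follow the text; how the per-trait discrepancies were combined into a single self-insight score is not stated, so the averaging step (mean absolute discrepancy over the qualifying traits, lower meaning closer agreement) is an assumption, as are all names and sample values.

```python
from statistics import mean

CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def trait_qualifies(rating_1, conf_1, rating_2, conf_2):
    """Apply criteria (a) and (b) to one trait rated by two psychologists."""
    gap = abs(rating_1 - rating_2)
    if gap == 0:
        # (a) identical ratings, each held with high or medium confidence
        lowest = min(CONFIDENCE_RANK[conf_1], CONFIDENCE_RANK[conf_2])
        return lowest >= CONFIDENCE_RANK["medium"]
    if gap == 1:
        # (b) one-point disagreement, both ratings held with high confidence
        return conf_1 == "high" and conf_2 == "high"
    return False


def self_insight_discrepancy(self_ratings, judge_1, judge_2):
    """Return (mean absolute discrepancy, number of traits used).

    judge_1 and judge_2 are lists of (rating, confidence) pairs per trait.
    Averaging over qualifying traits is an assumed aggregation rule.
    """
    gaps = []
    for own, (r1, c1), (r2, c2) in zip(self_ratings, judge_1, judge_2):
        if trait_qualifies(r1, c1, r2, c2):
            gaps.append(abs(own - (r1 + r2) / 2))
    return (mean(gaps) if gaps else None, len(gaps))


# Hypothetical data for three traits.
own = [2, 4, 3]
judge_a = [(2, "high"), (3, "high"), (1, "low")]
judge_b = [(2, "medium"), (4, "high"), (1, "high")]
result = self_insight_discrepancy(own, judge_a, judge_b)
print(result)
```

In this illustration the third trait is excluded because one psychologist reported low confidence even though the two ratings agree.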


Results 


Intercorrelation Among the Six Self-Concept 
Measures 


Table 1 shows that each of the six variables 
is positively correlated with every other variable 
and that 11 of the n(n − 1)/2 = 15 different 
comparisons are significant at or beyond 
the .05 level, using a two-tailed test. 
The correlation coefficients were obtained 
using the formula for the Pearson product-moment 
r. Each of the six variables was distributed 
approximately normally. 


Table 1 
Correlations Among the Six Self-Concept Variables 
(N = 24) 

                 Discrepancy Variables       Instability Variables 
Variable         Self-Social  Ideal-Social   Inst. Self  Inst. Ideal  Inst. Social 

Discrepancy 
  Self-Ideal     +.81**       +.74**         +.56**      +.28         +.71** 
  Self-Social                 +.51*          +.71**      +.21         +.65** 
  Social-Ideal                               +.45*       +.31         +.47* 

Instability 
  Self                                                   +.50*        +.68** 
  Ideal                                                               +.33 

* Significant at .05 level by two-tailed test. 
** Significant at .01 level by two-tailed test. 
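
The count of 15 comparisons and the number reaching significance can be checked with a short sketch. The paper does not say how significance was determined; the sketch assumes the usual t test of a Pearson r against zero with N − 2 = 22 degrees of freedom, with critical values taken from a standard t table. The coefficients are those of Table 1.

```python
from itertools import combinations
from math import sqrt

VARIABLES = [
    "Self-Ideal discrepancy",
    "Self-Social discrepancy",
    "Social-Ideal discrepancy",
    "Instability of Self",
    "Instability of Ideal Self",
    "Instability of Social Self",
]
N_SUBJECTS = 24

# Two-tailed critical t values for df = 22, from a standard t table.
T_CRIT_05 = 2.074
T_CRIT_01 = 2.819

# The 15 coefficients of Table 1, read row by row.
TABLE_1_R = [
    0.81, 0.74, 0.56, 0.28, 0.71,
    0.51, 0.71, 0.21, 0.65,
    0.45, 0.31, 0.47,
    0.50, 0.68,
    0.33,
]


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero with n pairs."""
    return r * sqrt((n - 2) / (1.0 - r * r))


n_pairs = len(list(combinations(VARIABLES, 2)))  # n(n - 1)/2 with n = 6
n_sig_05 = sum(abs(t_from_r(r, N_SUBJECTS)) >= T_CRIT_05 for r in TABLE_1_R)
n_sig_01 = sum(abs(t_from_r(r, N_SUBJECTS)) >= T_CRIT_01 for r in TABLE_1_R)
print(n_pairs, n_sig_05, n_sig_01)
```

Under these assumptions the sketch agrees with the 11 significant comparisons stated in the text, 7 of which also reach the .01 level.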


Search for Artifacts 


Finding everything correlated with every- 
thing is rather unusual. Therefore, before ac- 
cepting these correlations at face value, the 
procedure used to obtain the self-concept 
scores and the correlations among them was 
examined in an effort to discover possible 
artifacts. For example, the possibility that the 
positive correlations were due to some sort of 
common response set was examined. In search- 
ing for evidence relevant to this possibility, 
it was found that Ss differed from one an- 
other with respect to the way in which they 
used the five-point scale. For example, some 
Ss used predominantly neutral points on the 
scale, whereas others used predominantly ex- 
treme points on the scale. Similarly, some Ss 
tended to use only a limited range of the 
scale, whereas others tended to use the full 
range. If either tendency had been found to 
be correlated with any of the self-concept 
scores, the intercorrelation among the six self- 
concept variables would have become suspect. 
This would have been particularly true if the 
tendency to use neutral points or the tend- 
ency to use a restricted range had been found 
to be associated with low discrepancy or low 
instability. However, neither of these tendencies 
was found to be significantly correlated 
with any of the six self-concept variables. 

In searching for possible artifactual mean- 
ings of the correlations among the self-concept 
variables, the question arose as to whether or 
not certain Ss got higher instability and dis- 
crepancy scores than other Ss simply because 
they were less careful and conscientious in 
making their responses. If amount of insta- 
bility and discrepancy were correlated with 
amount of carelessness in taking the test, the 
correlations among the six self-concept vari- 
ables would be spuriously increased. The Ed- 
wards consistency variable was taken as a 
measure of carefulness-carelessness in “per- 
sonality-test-taking.” (An S’s score on the 
Edwards consistency variable depends on the 
number of times he responds to identical test 
items in the same way.) The fact that this 
variable was not found to be significantly cor- 
related with any of the six self-concept vari- 
ables was taken as evidence against the as- 
sumption that correlations among the six self- 
concept variables were spuriously increased 
by variation among the Ss with respect to 
carefulness-carelessness in taking the self-con- 
cept test. 

Another possible source of error which was 
investigated was the fact that certain of the 
six self-concept scores are not based on com- 
pletely independent data. For example, the 
discrepancy between “Self” and “Ideal Self” 
and the discrepancy between “Social Self” 
and “Ideal Self” contain a common element: 
“Ideal Self.” This problem exists in the discrepancy 
vs. discrepancy correlations but not 
in the instability vs. instability correlations 
and only to a very slight degree³ in the instability 
vs. discrepancy correlations. Table 1 
indicates that although the correlations of the 
first of these three types are somewhat higher 
than those of the second and third types, the 
latter two types are nonetheless consistently 
high (8 out of 12 being significant at or be- 
yond the .05 level using a two-tailed test). 

It will be remembered that instability 
scores are based on variance values. An ex- 
amination of the nature of the distributions 
of the variances indicated that they are dis- 
tributed logarithmically rather than normally. 
Because the variances are logarithmically dis- 
tributed, extreme variance values on two or 
three of the 29 traits could greatly increase 
an S’s total instability score. It seemed pos- 
sible that in some way this might be spuri- 
ously increasing the correlations among the 
three instability variables. To determine 
whether or not the instability scores were indeed 
being unduly influenced by a few items 
with high variance values, a new method of 
determining instability of “Self,” “Social 
Self,” and “Ideal Self” was developed which 
avoided this problem, and correlations be- 
tween scores yielded by the two methods were 
determined. With the new method, an S’s in- 
stability score depended on the number of 
traits, out of a possible 29, to which that S 
gave the same rating on all three test occa- 
sions. This new method of obtaining insta- 
bility scores gave the Ss scores which were 
very similar to those obtained with the mean 
of variances technique. The correlations be- 
tween instability scores based on the two 
methods were + .86, + .84, and + .79 for in- 
stability of “Self,” “Social Self,” and “Ideal 
Self,” respectively. 
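
The count-based alternative can be sketched as follows. The text says only that the score "depended on" the number of traits rated identically on all three occasions; scoring instability as the number of traits whose rating changed at least once is an assumption, and the names and data are hypothetical.

```python
def unchanged_trait_count(occasions):
    """Number of traits given the same rating on every occasion."""
    return sum(1 for ratings in zip(*occasions) if len(set(ratings)) == 1)


def count_based_instability(occasions):
    """Traits whose rating changed at least once (assumed scoring direction)."""
    n_traits = len(occasions[0])
    return n_traits - unchanged_trait_count(occasions)


# Hypothetical ratings for three traits on three occasions.
occasions = [
    [1, 3, 2],
    [1, 4, 2],
    [1, 5, 5],
]
print(unchanged_trait_count(occasions), count_based_instability(occasions))
```

Each trait contributes at most one point however far its ratings swing, which is why a few extreme variances cannot dominate this score.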

3 Six of the nine correlations between discrepancy 
and instability measures have a minor degree of non- 
independence. For example, one of the two sets of 
ratings contributing to an S’s score on discrepancy 
between “Self” and “Ideal Self” is the same as one 
of the three sets of ratings contributing to his “in- 
stability of Self” score, viz., his ratings of “Self” un- 
der placebo. However, it is not clear how this type 
of nonindependence could spuriously inflate the cor- 
relations involved. Furthermore, of the three correla- 
tions which are completely independent, one reaches 
the .01 level of significance and one reaches the .05 
level of significance, using a two-tailed test. 


Perhaps the most convincing evidence 
against the supposition that the intercorrela- 
tion among the six self-concept variables is 
due to artifacts was obtained by comparing 
the discrepancy vs. discrepancy correlations 
and instability vs. instability correlations with 
the discrepancy vs. instability correlations. 
As was pointed out before, the discrepancy 
scores are based on the subtractive relation- 
ship between ratings of the traits with respect 
to two different “dimensions” on the same oc- 
casion (the placebo day), whereas the insta- 
bility scores are based on the variance of rat- 
ings of the traits with respect to the same 
“dimension” over three different occasions. 
That is, the discrepancy scores and insta- 
bility scores were obtained using different op- 
erations and (with the minor exception men- 
tioned in Footnote 3) were based on different 
data. In view of this, it seems reasonable to 
assume that sources of error not discovered 
in the search for artifacts, discussed above, 
would be much more likely to affect the within 
class correlations (discrepancy vs. discrep- 
ancy, or instability vs. instability) than the 
across class correlations (discrepancy vs. in- 
stability). Following this assumption, all 15 
r values were transformed to z values, the 
mean z value for the six within class corre- 
lations and the mean z value for the nine 
across class correlations were obtained, and 
these two mean z values were transformed 
back to r values. The corrected average r 
value for the across class correlations was 
high (.51) and was encouragingly close to 
the corrected average r value for the within 
class correlations (.62). 
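
The averaging procedure (r to z, mean z, back to r) can be sketched with Fisher's transformation, z = artanh(r). The coefficients below are the two-decimal values printed in Table 1, so the result can differ slightly from averages computed on unrounded coefficients.

```python
from math import atanh, tanh


def mean_r_via_z(r_values):
    """Average correlations by way of Fisher's z transformation."""
    z_values = [atanh(r) for r in r_values]
    return tanh(sum(z_values) / len(z_values))


# Within-class: discrepancy vs. discrepancy and instability vs. instability.
within_class = [0.81, 0.74, 0.51, 0.50, 0.68, 0.33]

# Across-class: discrepancy vs. instability.
across_class = [0.56, 0.28, 0.71, 0.71, 0.21, 0.65, 0.45, 0.31, 0.47]

within_mean = mean_r_via_z(within_class)
across_mean = mean_r_via_z(across_class)
print(round(within_mean, 2), round(across_mean, 2))
```

With the rounded table entries this gives about .62 within class and about .50 across class, against the reported .62 and .51; the small gap is presumably a matter of rounding.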


Reliability of the Six Self-Concept Measures 


Split-half reliability coefficients were ob- 
tained for all six measures, and test-retest 
reliability coefficients were obtained for the 
three discrepancy measures. The period in- 
tervening between the test and the retest 
varied from S to S. The minimum was 228 
days; the maximum was 420 days; the mean 
was 330 days. Using the Spearman-Brown 
correction, the split-half reliability coefficients 
for the three discrepancy variables were: 
“Self”–“Ideal Self” = + .88; “Social Self”–“Ideal 
Self” = + .85; “Self”–“Social Self” = + .91; 
those for the three instability variables 
were: “Self” = + .78; “Social Self” = + .86; 
“Ideal Self” = + .75. The test-retest 
coefficients for the three discrepancy variables 
were: “Self”–“Ideal Self” = + .82; “Social 
Self”–“Ideal Self” = + .63; “Self”–“Social 
Self” = + .78. All of these correlations were 
plotted to be certain that they were not being 
spuriously increased by a few Ss with extreme 
scores. It was found that the correlations 
held throughout the scattergrams. 


Table 2 
Correlations Between Six Self-Concept Variables and 49 Non-Self-Concept Variables 

Discrepancy Variable Instability Variable 

Self- Ideal- Self- Inst. Inst. Inst. 
Measure Ideal Social Social Self i 

Self-insight measure —.57** —.41* - —.55** 

Average-mood measures 
+.06 . +.21 +.30 
+.37 +.46* +.53f* 
+.16 +.24 +.48 
+.18 +.23 +.28. 
+.26 +.29 +.36 
+.42* 
+.32 
+.28 
+.34 
+.52** 
+.48* 
+.21 
Manson maladjustment +.64** 
Maslow insecurity +.65** 
Edwards variables 
Achievement d —.26 
Deference 
Order 
Exhibition 
Autonomy 
Affiliation 
Intraception 
Succorance 
Dominance 
Abasement 
Nurturance 
Change 
Endurance 
Heterosexuality 
Aggression 
TSE extraversion 
Thinking 
Social 
Emotional 
Cattell factors 
A–Cyclothymia 
B–Intelligence 
C–Emot. stability 
E–Dominance 
F–Surgency 
G–Charac. strength 
H–Adventurous 
I–Emot. sensitivity 
L–Paran. schizothym. 
M–Bohemianism 
N–Sophistication 
O–Anxious insecurity 
Q1–Radicalism 
Q2–Independent 
Q3–Will control 
Q4–Nervous tension 

+28 +.03 
+.59%*  +.46° 
+32 +.29 
+36 +.21 
+41* +4.7* 
+.60%* +.51* 
+49% +.27 
+48* +.40* 
+30 +.22 
+.50* +.30 
+39 
+30 
+40* +.46* 
+4.43* 
-13 
+05 +.22 
+.03 
0 
+06 +.14 
-22 
+.43* 
+16 4.17 
+06 —.A45* 
+34 4.21 
+02 +.23 
+.13 
-19 —.36 
+12 
+32 -.10 
-34 
+08 +.26 
—11 
+38  +.05 
—12  +.14 
+08 
+.04 
-2 -04 
+13 
+05 +.26 
+12 +.14 
-08 +.24 
+26 +34 
-02 
+09 +41 
-37 
+13 +.46* 

Note. For correlations involving Cattell Factors, N = 20. For all other correlations, N = 24. 
* Significant at .05 level by two-tailed test. ** Significant at .01 level by two-tailed test. 
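
For readers unfamiliar with the Spearman-Brown correction applied to the split-half coefficients above, a minimal sketch follows. It uses the standard form for a test twice the length of each half, r_full = 2r / (1 + r). The paper does not report the uncorrected half-test correlations, so the inverse function is included only to show what a corrected value implies.

```python
def spearman_brown(r_half):
    """Step a half-test correlation up to full-test reliability."""
    return 2.0 * r_half / (1.0 + r_half)


def half_from_corrected(r_full):
    """Invert the correction: the half-test r implied by a corrected value."""
    return r_full / (2.0 - r_full)


# Corrected split-half coefficients reported for the discrepancy variables.
for corrected in (0.88, 0.85, 0.91):
    implied = half_from_corrected(corrected)
    print(corrected, round(implied, 3))
```

For example, a corrected coefficient of .88 implies a correlation of roughly .79 between the two halves of the test.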

In a subsequent study of the reliability of 
the three discrepancy variables, in a group of 
Ss whose discrepancy scores covered narrower 
ranges⁴ than those covered by the 24 Ss in 
the present study, somewhat lower reliability 
coefficients were obtained. In the subsequent 
experiment, the three discrepancy variables 
yielded split-half coefficients of + .79, + .83, 
and + .75. With an interval of two weeks, 
the test-retest coefficients were + .77, + .77, 
and + .60. The self-concept test is now un- 
dergoing item analysis in an effort to increase 
its reliability when used with Ss whose scores 
cover only moderate ranges. 


Correlations Between the Six Self-Concept 
Measures and Other Personality Meas- 
ures 


Table 2 reports correlations between each 
of the six self-concept variables and 49 “non- 
self-concept” measures of personality. A strik- 
ing characteristic of the correlations reported 
in Table 2 is the consistency of the associa- 
tion between measures of adjustment and 
measures of self-concept discrepancy and in- 
stability. These correlations indicate that high 
instability scores and high discrepancy scores 
are associated with poor adjustment scores, 
whereas low instability scores and low dis- 
crepancy scores are associated with good ad- 
justment scores. 

The consistent correlation between measures 
of self-concept discrepancy and instability 
on the one hand, and measures of 
adjustment on the other hand, suggests one 
reason for the intercorrelation among the 
self-concept variables, namely, that the six 
self-concept variables are correlated with each 
other because all of them are correlated with 
a common variable: adjustment. 


4 For example, in the subsequent experiment, the 
S with the highest total discrepancy between “Self” 
and “Ideal Self” had an average item discrepancy of 
1.36 scale points, and the S with the lowest total 
discrepancy between “Self” and “Ideal Self” had an 
average item discrepancy of 0.32 scale points. In 
contrast, in the experiment reported in the present paper, 
the highest average discrepancy between “Self” and 
“Ideal Self” was 1.93 scale points, and the lowest was 
0.14 scale points. 


Discussion 


Psychological Meaning of the Six Self-Con- 
cept Measures 


Since the meaning of the correlations re- 
ported in Tables 1 and 2 depends on the 
meaning of the six self-concept variables in- 
volved in these correlations, it seems desir- 
able to specify what we at present assume to 
be the psychological meaning of these six 
variables. The meaning of each variable will 
be given by indicating the psychological mean- 
ing which we attribute to a high score on that 
variable. 

1. A high discrepancy between “Self” and 
“Ideal Self” indicates that the S feels inadequate 
relative to his ideal; that is, he evaluates 
himself unfavorably. 

2. A high discrepancy between “Social Self” 
and “Ideal Self” indicates that the S feels 
that other people perceive him in ways which 
are very different from his standard of perfection; 
that is, he feels that their perceptions 
of him are, relative to his own ideal, unfa- 
vorable. 

3. A high discrepancy between “Self” and 
“Social Self” indicates that the S thinks that 
other people do not accurately perceive and 
understand him. 

4. High “instability of Self” indicates that 
the S’s attitudes toward himself undergo 
marked change from time to time. 

5. High “instability of Social Self” indi- 
cates that the individual changes markedly 
from time to time concerning what he believes 
other people think of him. 

6. High “instability of Ideal Self” indi- 
cates that the individual’s goals, values, and 
ideals fluctuate markedly over time. 


Interrelations Among the Six Self-Concept 
Measures 


If the above attribution of meaning to the 
six self-concept variables is accepted as rea- 
sonable, then the positive correlations among 
these six variables become quite reasonable. 
It is not surprising, for example, that the S 
who evaluates himself unfavorably (high discrepancy 
between “Self” and “Ideal Self”) 
also tends to react in the following ways: he 
feels that he is evaluated unfavorably by 
others (high discrepancy between “Social 
Self” and “Ideal Self”); he feels misper- 
ceived by others (high discrepancy between 
“Self” and “Social Self”); he changes mark- 
edly from time to time concerning what he 
thinks he is (high “instability of Self”); he 
changes markedly from time to time concern- 
ing what he thinks other people think of him 
(high “instability of Social Self”); and he 
changes markedly from time to time concern- 
ing what he would like to be (high “insta- 
bility of Ideal Self”). Similarly, it is not sur- 
prising that the S who evaluates himself 
favorably also feels evaluated favorably by 
others, also feels understood by others, and 
also has rather stable ideas about what he is, 
what he believes others think of him, and 
what he would like to be. 

Stating that the intercorrelation among the 
six self-concept variables is reasonable and is 
not due to experimental artifacts does not 
explain⁵ this intercorrelation. To attempt a 
theoretical explanation of the intercorrela- 
tion among the six self-concept variables 
would involve a rather extensive manipulation 
of some empirical findings and many hunches. 
These findings and hunches would perhaps 
most profitably deal with the events occur- 
ring early in childhood which might affect 
the development of the total configuration of 
ideas and valuations which the individual has 
concerning himself. An extensive endeavor of 
this sort does not seem appropriate in the 
present report. 

Dynamics. It might be useful, however, to 
mention just a few observations and hunches 
(suggested by numerous observers of human 
behavior: W. James, G. H. Mead, P. Lecky, 
C. Rogers, H. S. Sullivan, K. Horney, to 
mention a few) which might be helpful in 
developing such an explanation. For example, 
the statements presented in the following 
three paragraphs might, if they are valid, 
shed some light on the factors involved in the 
production of the correlations between the 
“Social Self”–“Ideal Self” discrepancy and 
the other five measures of self-concept 
discrepancy and instability. (Earlier in this 
section we assumed that a high discrepancy 
between “Social Self” and “Ideal Self” indicates 
that the S feels that the perceptions of him 
held by other people are, relative to his own 
standard of perfection, unfavorable. For the 
sake of speculation, we now assume that the 
degree of discrepancy between “Social Self” 
and “Ideal Self” reflects the degree to which 
the individual feels valued by other people.) 
The following statements are presented merely 
to exemplify the way in which speculation 
about events which might influence the 
development of various aspects of the self-concept 
might give some meaning to the five 
correlations involving the “Social Self”–“Ideal 
Self” discrepancy. (The remaining 10 
correlations will not be treated in this manner in 
the present report.) 


5 Following Feigl’s (11) definition of explanation 
as being the “inductive-deductive or (on higher 
levels) hypothetico-deductive derivation of the more 
specific (ultimately descriptive) propositions from 
more general assumptions (laws, hypotheses, 
theoretical postulates) in conjunction with other 
descriptive propositions (and often together with 
definitions).” 

Concerning the correlation between the “Social 
Self”–“Ideal Self” discrepancy and the 
“Self”–“Ideal Self” discrepancy, where r = 
+ .74, one might speculate as follows: The 
child’s ideas about, and valuations of, him- 
self are profoundly influenced by what he be- 
lieves to be the ideas held about him and the 
valuations made of him by other people with 
whom he interacts. That is, early in life the 
individual learns what to think of himself by 
observing what others think of him. If one 
feels highly valued by others, he is likely to 
value himself highly. If one does not feel 
highly valued by others, he is not likely to 
value himself highly. 

Concerning the correlation between the 
“Social Self”–“Ideal Self” discrepancy and 
the “Self”–“Social Self” discrepancy, where 
r = + .51, one might say: To feel valued by 
others is a powerful human motive. If an in- 
dividual feels that he is not valued by others, 
he may attempt to deal with the disappoint- 
ment by concluding that he is not valued be- 
cause he is not understood. 

Concerning the correlations between the 
“Social Self”–“Ideal Self” discrepancy and 
the instability of “Ideal Self,” “Self,” and 
“Social Self,” where the r values are + .31, 
+ .45, and + .47, respectively, one might 
say: The frustration of an individual’s desire 
to feel valued by others, and the consequent 
attempts to understand why he is not valued, 
could lead to uncertainty and confusion con- 
cerning (a) the ideals and standards he should 
try to maintain, (b) the qualities and characteristics 
which he feels comprise his “Self,” 
and (c) the validity of his notions of what 
other people think of him. 

Additional data needed. Before engaging in 
further speculation concerning the possible 
meanings of the positive correlations among 
the six self-concept variables, it seems desir- 
able to obtain considerably more empirical 
data dealing with such questions as: Are the 
correlations reported in the present paper re- 
peatable; and, if so, in what population of 
Ss? What personality characteristics distin- 
guish individuals who contribute to these 
positive correlations from individuals who do 
not? In what ways are the six self-concept 
variables similar to and different from each 
other in meaning, i.e., what personality vari- 
ables correlate similarly with the six self- 
concept variables with respect to magnitude 
and direction; and what personality variables 
correlate differentially with the six self-con- 
cept variables with respect to magnitude and 
direction? How do these six self-concept vari- 
ables relate to other self-concept dimensions? 
It might be particularly worthwhile to deter- 
mine the way in which these six variables 
relate to a measure of Lecky’s (14) concept 
of “self-consistency.” What are the develop- 
mental relations among the six self-concept 
variables? For example, one might expect the 
development of the ideas and attitudes re- 
flected by the “Social Self”—“Ideal Self” dis- 
crepancy to be important in affecting the de- 
velopment of those reflected by the other five 
variables. Do the instability variables reflect 
the degree of unclarity of ideas and attitudes 
relating to the self; or do they reflect con- 
flicting and mutually inconsistent ideas and 
attitudes; or do they reflect both; or do they 
reflect neither; or do they reflect one in some 
Ss, the other in some Ss, both in some Ss, and 
neither in some Ss? 


Relations Between the Six Self-Concept Vari- 
ables and Other Personality Measures 


At present, perhaps the most useful way of 
giving meaning to the six self-concept vari- 
ables is by specifying the way in which they 
are correlated with other measures of person- 
ality which have already acquired meaning. 
Table 2 presents these correlations. Ideally, 
the meaning of the “non-self-concept” meas- 
ures should be given by specifying the items 
on which they are based and by presenting 
the validity information concerning these 
measures. However, space limitations prevent 
this in the present paper. 

In the results section, it was stated that 
there is a highly consistent tendency for high 
discrepancy and high instability of self-con- 
cept to be associated with poor adjustment 
and for low discrepancy and low instability 
to be associated with good adjustment. The 
acceptability of this statement depends on 
the acceptability of the assumption that the 
“non-self-concept” measures, which are cor- 
related with the self-concept variables, do in- 
deed measure some aspect of maladjustment- 
adjustment. In the present paper, extensive 
effort to justify this assumption will not be 
made. 

Self-insight. Table 2 indicates that all six 
self-concept measures are negatively corre- 
lated with self-insight at or beyond the .05 
level of confidence, using a two-tailed test. 
That is, high discrepancy on all three dis- 
crepancy measures and high instability on all 
three instability measures are associated with 
low self-insight. This finding is congruous 
with Rogers’ conception (20) of the relations 
among adjustment, self-awareness, and dis- 
crepancy between “Self” and “Ideal Self.” 

Average mood. Table 2 shows that 19 of 
the 72 correlations between self-concept meas- 
ures and measures of average-mood are sig- 
nificant at or beyond the .05 level, using a 
two-tailed test. These 19 correlations indicate 
(and the remaining 53 suggest) that high self- 
concept discrepancy and high self-concept in- 
stability are associated with reporting feeling 
uneasy, grouchy, worried, nervous, groggy, 
shy, sad, not peppy, mentally slow, dreamy, 
unsociable, and sleepy. Of the 53 correlations 
between self-concept variables and average- 
mood variables which do not reach the .05 
level of significance, only three fail to be con- 
sistent with the statement that high self-con- 
cept discrepancy and instability are associ- 
ated with the mood characteristics specified 
in the previous sentence. These 72 correla- 
tions yield an unusually consistent picture. If 
an S’s ratings of his mood on the 12 mood 
items on the 15 different occasions are taken 
as valid indications of the way he generally 
feels, the correlations between self-concept 
measures and average-mood measures indi- 
cate that Ss with high self-concept discrep- 
ancy and high self-concept instability feel 
more anxious, irritable, unhappy, and inade- 
quate than do Ss with low self-concept dis- 
crepancy and low self-concept instability. 

These correlations between self-concept 
scores and average-mood scores are quite rea- 
sonable. It is not surprising that the person 
who disapproves of himself (high discrepancy 
between “Self” and “Ideal Self”) and feels 
others disapprove of him (high discrepancy 
between “Social Self” and “Ideal Self”) and 
feels misunderstood by others (high discrep- 
ancy between “Self” and “Social Self”) and 
is unable to maintain a stable notion of what 
he is, what he wants to be, or what he thinks 
others think of him (high instability of “Self,” 
“Ideal Self,” and “Social Self”) would feel 
anxious, irritable, unhappy, and inadequate. 
Similarly, it is not surprising that the person 
who approves of himself and thinks others 
approve of him and feels understood by others 
and has a stable notion of what he is, what 
he wants to be, and what he thinks others 
think of him, would feel secure, friendly, 
happy, and adequate. 

Manson and Maslow questionnaires. The 
correlations between self-concept measures 
and the Manson and Maslow tests indicate 
that high self-concept discrepancy and insta- 
bility are associated with high “Manson in- 
security” and high “Maslow maladjustment.” 
Eleven of these twelve correlations are sig- 
nificant at or beyond the .05 level of con- 
fidence, using a two-tailed test. The z trans- 
formed average r for these 12 correlations is 
+ .55. Thus, it is clear that the aspects of 
adjustment-maladjustment which are meas- 
ured by the Manson and Maslow tests are 
very similar to those aspects of adjustment- 
maladjustment which are measured by the six 
self-concept variables. 
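The averaging of correlations through the z transformation can be sketched as follows. This is a minimal illustration, assuming the usual Fisher procedure (convert each r to z, average the z values, convert back); the three correlations used are hypothetical and are not the twelve values from Table 2.

```python
import math

def fisher_z_average(rs):
    """Average correlations via Fisher's z: r -> atanh(r), mean, then tanh back."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))

# Hypothetical correlations for illustration only (not the paper's data).
example_rs = [0.40, 0.55, 0.70]
avg_r = fisher_z_average(example_rs)
print(round(avg_r, 3))
```

For positive correlations the z-transformed average is never smaller than the plain arithmetic mean of the r values, because the transformation stretches the upper end of the scale.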

Edwards variables. Most of the correlations 
between the self-concept variables and the 
Edwards variables are not high enough to be 
significant at the .05 level using a two-tailed 
test. However, when the total configuration of 
correlations between the Edwards variables 
and the self-concept variables is examined, it 
is seen that high discrepancy and high in- 
stability consistently tend to be associated 
with need characteristics which are frequently 
found in poorly adjusted individuals. (The 
correlation between “instability of Ideal Self” 
and need intraception may be an exception.) 
Need succorance, need dominance, need en- 
durance, and need intraception are associated 
with one or more of the six self-concept vari- 
ables at the .05 level of confidence using a 
two-tailed test. High instability and discrep- 
ancy are associated with high need succorance, 
low need dominance, low need endurance, and 
high need intraception. Although not signifi- 
cant at the .05 level of confidence, the corre- 
lations between the self-concept scores and 
scores on need abasement, need achievement, 
and need aggression indicate that high insta- 
bility and discrepancy might be associated 
with high need abasement, low need achieve- 
ment, and high need aggression. 

T-S-E measures. Only the Social scale in 
the Minnesota Thinking, Social, and Emo- 
tional Introversion-Extraversion Test corre- 
lates significantly with the self-concept vari- 
ables. Four of the six self-concept variables 
correlate with the Social Introversion-Extra- 
version variable at or beyond the .05 level of 
confidence, using a two-tailed test. High self- 
concept discrepancy and instability are asso- 
ciated with social introversion. 

Cattell factors. The correlations between 
the Cattell Factors and the self-concept vari- 
ables agree with the findings discussed above 
in that high self-concept discrepancy and in- 
stability are associated with poor adjustment. 
Table 2 indicates that one or more of the 
six self-concept variables is significantly posi- 
tively correlated with Factors M, O, and Q4, 
and significantly negatively correlated with 
Factors C, E, and Q3. Although not signifi- 
cant at the .05 level of confidence, there is a 
consistent tendency for the self-concept vari- 
ables to be positively correlated with Factor 
L. In other words, high discrepancy and in- 
stability are associated with high Factor M 
(“Hysteric Unconcern” or “Bohemianism”), 
high Factor O (“Anxious insecurity”), high 
Factor Q4 (“Nervous tension”), low Factor 
C (“Emotional stability”), low Factor E 
(“Dominance or ascendance”), low Factor 
Q3 (“Will control and character stability”), 
and high Factor L (“Paranoid schizothymia”). 
Cattell has recently reported (6) that five of 
these factors (viz., O, Q4, C, Q3, and L) are 
found to comprise a second-order factor, 
which he names Anxiety vs. Dynamic Inte- 
gration. Cattell suggests three alternative 
interpretations of this second-order factor: 
“first, that it is a factor of general anxiety; 
second, a factor of general neuroticism; and 
third, that it is the different form of per- 
sonality disorganization associated with psy- 
choticism” (6, p. 416). In this second-order 
factor, the primary factors O, Q4, and L have 
positive signs, while Q3 and C have negative 
signs. Of the 30 correlations between the six 
self-concept variables and the five primary 
factors comprising this second-order factor, 
29 are consistent with the conclusion that 
high scores on the six self-concept variables 
are indicative of poor adjustment. Seven of 
these 29 correlations are significant at or be- 
yond the .05 level using a two-tailed test. 

General comments. The principal findings 
of the present study are that the six self- 
concept variables are positively correlated 
with each other and are negatively correlated 
with adjustment. The high intercorrelation 
among the six self-concept variables (Table 1) 
and the consistency of their correlation with 
various measures of adjustment (Table 2) 
help clarify the meaning of these six self-con- 
cept variables by indicating the ways in which 
they are similar. However, clarification of 
meaning can also be obtained by indicating 
the dissimilarities among these six variables. 
Data obtained from a sample considerably 
larger than that of the present study, ana- 
lyzed with factorial techniques, would be use- 
ful in this respect. 

Even within the present data, however, 
there are some suggestions concerning the 
ways in which the six self-concept variables 
might differ. For example, Table 1 indicates 
that “instability of Ideal Self” is probably 
the most unique of the six variables. Table 2 
also offers some suggestions concerning pos- 
sible differences among these six variables. 
For example, the Cattell factors correlate 
higher with the discrepancy variables than 
with the instability variables. This is par- 
ticularly true of the correlations involving the 
discrepancy between “Self” and “Ideal Self.” 
On the other hand, the average-mood vari- 
ables correlate higher with the instability 
than with the discrepancy variables. This is 
particularly true of the correlations involving 
the “instability of Social Self.” 

Theorists and experimenters have found the 
discrepancy between “Self” and “Ideal Self” 
to be a useful measure. Perhaps further study 
of the other five self-concept measures dis- 
cussed in the present paper would show them 
also to be of value. This would be particularly 
likely in the case of the discrepancy between 
“Social Self” and “Ideal Self,” if it were found 
that the development of the ideas and atti- 
tudes reflected by this measure is important 
in affecting the development of the ideas and 
attitudes reflected by the other five self-con- 
cept variables. 

Further study of the three instability vari- 
ables also seems worthwhile. Inability to 
maintain a stable conception of the “Self,” 
“Ideal Self,” and “Social Self” is associated 
with low self-insight and with other aspects 
of maladjustment, such as anxiety, hostility, 
dejection, and feelings of inadequacy. A thor- 
ough understanding of this association should 
be useful in the further development of per- 
sonality theory and therapeutic technique. If 
it were found, for example, that individuals 
with high “instability of Self” scores allow 
their self-evaluations to be inordinately influ- 
enced by immediately preceding successes, 
failures, approvals, disapprovals, etc., whereas 
individuals with low scores anchor their self- 
evaluations firmly in the events of their total 
past, then the meaning of the association be- 
tween “instability of Self” and various meas- 
ures of adjustment would become clarified. 


Summary 


Scores on six self-concept measures were ob- 
tained for each of 24 college men. These six 
measures were: discrepancy between “Self” 
and “Ideal Self,” discrepancy between “Self” 
and “Social Self,” discrepancy between “So- 
cial Self” and “Ideal Self,” “instability of 
Self,” “instability of Ideal Self,” and “in- 
stability of Social Self.” All six measures were 
found to be positively correlated with each 
other. Eleven of the 15 correlations were sig- 
nificant at the .05 level of confidence using a 
two-tailed test. The possibility of spuriously 
high intercorrelation among the six self-con- 
cept measures was examined, and the reasons 
for rejecting this possibility were discussed. 
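As a rough check on what "significant at the .05 level" implies for a sample of 24, the smallest significant correlation can be computed from the critical t value. The sketch below assumes the standard t test for a product-moment correlation (the paper does not state which test was used) and takes the tabled two-tailed critical t for 22 degrees of freedom, 2.074, as given.

```python
import math

def critical_r(t_crit, n):
    """Smallest |r| significant for sample size n, given the critical t (df = n - 2)."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

N_SUBJECTS = 24            # college men in the sample
T_CRIT_05_DF22 = 2.074     # tabled two-tailed .05 critical t for 22 df (assumed input)

r_min = critical_r(T_CRIT_05_DF22, N_SUBJECTS)

# Six measures give C(6, 2) distinct pairs, i.e. the 15 intercorrelations.
n_pairs = math.comb(6, 2)
print(round(r_min, 3), n_pairs)
```

Under these assumptions a correlation of about .40 or larger would be needed, so a value such as + .45 would reach the .05 level while + .31 would not.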

Split-half reliability coefficients were ob- 
tained and reported for all six measures, and 
test-retest coefficients were obtained and re- 
ported for the three discrepancy measures. 

The psychological meanings presently at- 
tributed to the six self-concept measures were 
specified, and it was concluded that in view 
of these meanings, the intercorrelation among 
the six measures is reasonable. Five of the 15 
correlations among the six measures (namely, 
those involving the “Social Self”—“Ideal Self” 
discrepancy) were discussed in some detail, 
and dynamics which could give rise to these 
five correlations were suggested. Certain ques- 
tions which appear to be worthy of experi- 
mental examination were raised concerning 
the meaning of the six measures and concern- 
ing the meaning of their interrelations. 

The correlations between each of the six 
self-concept measures and each of 49 other 
measures of personality were presented and 
discussed. One of these 49 was a measure of 
self-insight; 12 were measures of average- 
mood; the remaining 36 measures were ob- 
tained with personality inventories. On the 
basis of these 294 correlations, it was con- 
cluded that all six self-concept variables are 
highly related to such aspects of adjustment 
as self-insight, anxiety, dejection, friendliness, 
insecurity, social introversion, will control, 
etc. High discrepancy scores on all three dis- 
crepancy measures, and high instability scores 
on all three instability measures were found 
to be associated with poor adjustment scores. 

It was suggested that one possible reason 
for the intercorrelation among the six self- 
concept variables is their correlation with a 
common variable: adjustment. 


Received May 3, 1957. 


Differentiation of Clinical Groups Using 
Canonical Variates¹ 


H. R. Beech and A. E. Maxwell 


The present study is part of an investiga- 
tion into the factors which influence the re- 
production of simple designs by drawing, and 
represents an extension of work by Shapiro 
(11, 12, 13) and Yates (14) into rotation 
effects. The generalizations put forward by 
Shapiro to account for the greater amounts 
of rotation found among brain-damaged sub- 
jects (13) were used in predicting certain 
other anomalies which would be found in the 
drawing reproductions of that group. The 
main notion which led to the measures (i.e. 
measures other than the rotation of designs) 
being taken in the recent study was that ex- 
aggerated negative induction effects result in 
the restriction of the available perceptual 
cues in brain-damaged patients. 

This paper presents the results obtained 
from a canonical variate analysis employing 
four measures or tests derived from the ap- 
plication of Shapiro’s generalizations about 
the effects of brain damage to the Drawing 
Rotation Test (14). The main aim of the 
analysis was to improve the differentiations 
obtained between brain-damaged and nonor- 
ganic groups. Other studies (4, 5) making 
use of this statistical technique have produced 
favourable results so far as differentiation is 
concerned. 


1 The work reported in this paper was made pos- 
sible by a grant from the Research Fund, made 
available from the endowment by the Board of Gov- 
ernors of the Bethlem Royal and Maudsley Hospitals. 


The Measures 


The Drawing Rotation Test comprises 40 
model designs which are placed in front of 
the subject (S) at specified distances, and he 
is asked to copy them by drawing. All the de- 
signs are simple, symmetrical, and capable of 
reproduction using four Kohs blocks. Ten dif- 
ferent designs are used in all, each of which 
appears four times in the test series, once in 
each of the four possible combinations of fig- 
ure and ground orientations. Three examples 
of these designs are shown in Fig. 1. The 
models may be in either square or diamond 
orientation, and it has been found (12, 13) 
that organic Ss show a tendency to rotate 
their reproductions (i.e. to produce discrep- 
ancies in orientation between the model and 
their reproduction), which is greater than 
that found among nonorganic patients. Our 
first measure, therefore, was an index of the 
extent to which reproduction misorientation 
occurred, this index being the mean rotation 
score for the 40 designs given. We have called 
this measure Test 3. A measure of the time 
taken to complete the first eight designs was 
called Test 4. In the Rotation Test an S is 
allowed up to three minutes in which to com- 
plete any design, although failure to do so in 
the given time is very rare. The time meas- 
ure, which has been found to differentiate be- 
tween normals and abnormals (2), was the 
mean time, in seconds, which the S took to 
complete the first eight designs out of the 
series of forty. The decision to consider only 
eight designs for Test 4 was quite arbitrary 
and purely a matter of convenience in mak- 
ing the necessary measurements. This limita- 
tion of the number of designs also applied to 
Tests 1 and 2 where the scoring procedure 
was too time-consuming to permit considera- 
tion of a large number of reproductions. 

Fig. 1. Three examples of the designs used in the study. 

Test 1, called disproportionality (2), may 
be looked upon as an index of the magnitude 
of errors in bisection. Almost all the designs 
used in the test have their internal structure 
so arranged that certain lines exactly bisect 
the external framework of the design. Dis- 
proportionality represents a ratio of parts of 
the design which, because of bisecting inter- 
nal parts, should be equal in length. Lack 
of proportionality, or disproportionality, was 
measured by dividing the longer of the two 
lengths which should be equal by the shorter. 
In all, twenty-five of these indices were ob- 
tained from the first eight designs, the mean 
disproportionality score being the sum of the 
ratios (taken to three decimal places) di- 
vided by 25. 
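The scoring rule for Test 1 can be sketched in a few lines. The function name and the measurements below are hypothetical; only the rule itself (longer length divided by shorter, each ratio taken to three decimal places, then averaged) comes from the description above, and the example uses four pairs rather than the 25 obtained in the study.

```python
def disproportionality(pairs):
    """Mean of longer/shorter ratios for pairs of lengths that should be equal.

    pairs: list of (length_a, length_b) measurements from an S's drawings.
    Each ratio is rounded to three decimal places before averaging.
    """
    ratios = [round(max(a, b) / min(a, b), 3) for a, b in pairs]
    return sum(ratios) / len(ratios)

# Hypothetical measurements in millimetres (illustration only).
example_pairs = [(20.0, 20.0), (22.0, 20.0), (18.0, 24.0), (21.0, 20.0)]
score = disproportionality(example_pairs)
print(score)
```

A score of 1.000 corresponds to perfect bisection throughout; larger scores indicate larger bisection errors, whichever of the two parts was drawn too long.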

Test 2, colour score, represents an attempt 
to estimate the extent to which the yellow 
aspects of an S’s reproductions occupy a 
larger amount of the available space than the 
blue aspects. All the designs are in blue and 
yellow. The S is asked not only to draw the 
design but also to indicate which parts are 
blue and which yellow in his reproduction. 
The measurements taken are based upon the 
fact that in each design equal amounts of 
space are given to the two colours, so that 
any tendency to exaggerate yellow aspects at 
the expense of blue will be discovered by tak- 
ing a ratio of blue to yellow parts. In prac- 
tice, this effect is tapped by comparing blue 
and yellow lengths where these colours form 
borders to the external framework of the de- 
sign, always dividing the yellow length by 
the blue. 
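A comparable sketch for the colour score follows. The text specifies only that the yellow border length is always divided by the blue; how the individual ratios are combined into one score is not stated, so averaging them is an assumption made here for illustration, as are the function name and the measurements.

```python
def colour_score(border_pairs):
    """Mean yellow/blue ratio over border segments that are equal in the model.

    border_pairs: list of (yellow_length, blue_length) measurements.
    Averaging the ratios is an assumption; the source gives only the
    direction of the division (yellow over blue).
    """
    ratios = [yellow / blue for yellow, blue in border_pairs]
    return sum(ratios) / len(ratios)

# Hypothetical measurements in millimetres (illustration only).
example_borders = [(21.0, 20.0), (19.0, 20.0), (24.0, 20.0)]
colour = colour_score(example_borders)
print(colour)
```

Unlike the disproportionality index, this ratio keeps its direction: values above 1 indicate exaggeration of the yellow parts, and values below 1 indicate exaggeration of the blue.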

Both colour and disproportionality scores 
have similarities in that both deal with parts 
of the designs which should be equal and both 
involve ratios of these parts, but it has been 
shown (2) that there is no necessary corre- 
spondence between the two measures. 


The Samples 


Five groups of Ss were used in the sta- 
tistical analysis of the four measures: 48 nor- 
mals, 20 neurotics, 26 schizophrenics, 17 de- 
pressives, and 33 brain damaged, making a 
total of 144 Ss. Most of these Ss were tested 
by three other investigators in the Maudsley 
Hospital as part of their research program. 
We are indebted to these experimenters for 
allowing us to make use of the drawings col- 
lected by them. Special care was taken in the 
case of functional patients and normal Ss to 
exclude those persons with symptoms or his- 
tory which were consistent with the possi- 
bility of brain damage. 


The normals comprised Ss drawn from adult edu- 
cation classes, nurses, secretarial staff, and other 
miscellaneous occupations. 

The brain-damaged group could be divided into 
two subgroups: a group of patients from the Neuro- 
surgical Unit of the Maudsley Hospital whose brain 
injury was a result of surgical interference (e.g., 
temporal lobectomy cases), and a group whose brain 
injury was due to some disease process (e.g., G.P.I.). 
It was found that these two categories did not dif- 
fer significantly in terms of their drawing test scores. 

The depressives were not all psychotic conditions 
but included some neurotic depression cases. The 
main distinguishing feature of this group was that 
all the individuals comprising the group were to re- 
ceive ECT as part of their treatment to relieve a 
depressed condition. 

The schizophrenics included early and chronic 
cases in about equal proportions. They were drawn 
mainly from the Maudsley Hospital. 

The neurotic group was composed almost entirely 
of patients diagnosed as “hysteric” or “anxiety state.” 
All of these Ss were drawn from the patient popula- 
tion of the Bethlem and Maudsley Hospitals. 


In all, five groups of Ss were tested by four 
different experimenters at more than one 
hospital. 


The Problem 


Comparing the five groups on the four 
Drawing Test measures produced significant 
differences between them. The statistical de- 
tails of these differences are shown in Table 1. 
It will be seen that where disproportionality 
(Test 1), colour (Test 2), and rotation (Test 
3) are concerned the differences are mainly 
between brain-damaged and non-brain-dam- 
aged groups, while on time (Test 4) the dif- 
ferentiation is mainly between normal and 
abnormal groups. 

However, although significant differences 
between the groups on these measures are ob- 
tained, we are left with large areas of mis- 
classification, and the results have limited 
clinical value. Even in the case of brain- 
damaged subjects, the outcome is disappoint- 
ing, for the bulk of this group overlaps with 
other groups. 


Table 1 
Significance of Differences Between the Groups on the Drawing Test Measures 

                                   Test 1      Test 2      Test 3      Test 4 
F ratios                           9.379**                 3.743**    19.613** 

t (2-tail) 
Groups compared                    Test 1      Test 2      Test 3      Test 4 
Brain damaged v. normal            3.538**     4.192**     3.337**     7.920** 
Brain damaged v. depressive        2.547*      2.488*      1.714       1.731 
Brain damaged v. schizophrenic     4.726**     5.446**     2.378*      1.150 
Brain damaged v. neurotic          3.856**     4.006**     3.217**     3.252** 
Normal v. depressive               0.139       0.757       0.977       4.001** 
Normal v. schizophrenic            1.773       1.891       0.541       6.220** 
Normal v. neurotic                 1.088       0.662       0.587 
Depressive v. schizophrenic        1.514       2.176*      0.458       0.689 
Depressive v. neurotic             0.477       1.187       1.310       1.225 
Schizophrenic v. neurotic          0.477       0.477       0.968       2.081* 

Note.—Tests 1, 2, 3, and 4 refer to the measures Disproportionality, Colour, Rotation, and Time, respectively. 

* Significance of .05 or above. 
** Significance of .01 or above. 

The main task at this stage, therefore, was 
to attempt to improve the differentiation be- 
tween the groups. The method chosen to 
achieve this end was, as stated above, a dis- 
criminant function analysis. The statistical 
model employed in such an analysis is now 
briefly described. 


The Statistical Model 


The model is one of finding a linear func- 
tion, or orthogonal linear functions, of the 
test scores which will best discriminate be- 
tween the groups. If $x_1$ to $x_4$ are taken to rep- 
resent the four tests and $u_1$ to $u_4$ are weights, 
such a function has the form: 


$$f(x) = u_1x_1 + u_2x_2 + u_3x_3 + u_4x_4 \qquad (1)$$


It will be realised that when only two 
groups are concerned, a solution to the prob- 
lem is provided by Fisher’s linear discrimi- 
nant function (6). When more than two 
groups are concerned, it cannot be assumed 
that the group means are colinear and a 
method is required which will indicate the 
number of dimensions in which the group 
means lie. One approach to the problem is 
that described by Rao (9) and used by Rao 
and Slater (10). The procedure involves 
maximising Mahalanobis’ $D^2$. Another ap- 
proach is that described by Lubin (7) and 
illustrated in a recent article by Eysenck (3). 
Here the square root of the correlation ratio, 
$\eta$, between the groups, on the one hand, and 
a linear function of the test scores, on the 
other, is maximised. This is the procedure 
followed here. It involves calculating a prod- 
uct-sum matrix, G, for the combined groups 
on the tests concerned, and a between-groups 
product-sum matrix, B. The problem then is 
to find a set of weights, i.e., a column vector, 
u, which will maximise the expression: 


$$\eta^2 = u'Bu / u'Gu \qquad (2)$$


where u' is the transpose of u. 

On differentiating this expression with re- 
spect to u and equating the result to zero, the 
determinantal equation which gives the re- 
quired values of u and the corresponding 
values of $\eta^2$ is obtained. It is: 


$$(G^{-1}B - \eta^2 I)\,u = 0 \qquad (3)$$


where I is a unit diagonal matrix. 
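The step from the ratio to be maximised to the latent-root equation can be written out as follows. This is a sketch of the standard argument, assuming that B and G are symmetric and that G is nonsingular; it is not quoted from the sources cited.

```latex
\begin{align*}
\eta^2 &= \frac{u'Bu}{u'Gu}, \\
\frac{\partial \eta^2}{\partial u}
  &= \frac{2Bu\,(u'Gu) - 2Gu\,(u'Bu)}{(u'Gu)^2} = 0 \\
\Rightarrow\quad Bu &= \eta^2\, Gu \\
\Rightarrow\quad (G^{-1}B - \eta^2 I)\,u &= 0 .
\end{align*}
```

A nonzero vector u can satisfy the last line only if the determinant of $G^{-1}B - \eta^2 I$ vanishes, which is why the admissible values of $\eta^2$ are the latent roots of $G^{-1}B$.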

The solution of equation (3) involves find- 
ing the latent roots and latent vectors of the 
asymmetric matrix $G^{-1}B$ (8). The number of 
latent roots possible will depend on the rank 
of the matrix. This can be shown (7) to be 
the number of tests or one less than the num- 
ber of groups, whichever is smaller. 
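The computation can be illustrated on a small scale. The sketch below is not the authors' procedure in detail: it uses only two tests and three groups of hypothetical scores so that the latent roots of the 2 x 2 matrix G⁻¹B can be obtained in closed form, and all function names are invented for the example.

```python
import math

def product_sum_matrices(groups):
    """Return (G, B) for two tests.

    groups: list of groups, each a list of (x1, x2) score pairs.
    G: product-sum matrix for the combined groups (about the grand mean).
    B: between-groups product-sum matrix.
    """
    scores = [s for g in groups for s in g]
    grand = [sum(s[k] for s in scores) / len(scores) for k in range(2)]
    G = [[0.0, 0.0], [0.0, 0.0]]
    B = [[0.0, 0.0], [0.0, 0.0]]
    for s in scores:
        d = [s[k] - grand[k] for k in range(2)]
        for i in range(2):
            for j in range(2):
                G[i][j] += d[i] * d[j]
    for g in groups:
        mean = [sum(s[k] for s in g) / len(g) for k in range(2)]
        d = [mean[k] - grand[k] for k in range(2)]
        for i in range(2):
            for j in range(2):
                B[i][j] += len(g) * d[i] * d[j]
    return G, B

def latent_roots_and_vectors(G, B):
    """Latent roots (largest first) and latent vectors of the 2x2 matrix G^-1 B."""
    det_g = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    g_inv = [[G[1][1] / det_g, -G[0][1] / det_g],
             [-G[1][0] / det_g, G[0][0] / det_g]]
    M = [[sum(g_inv[i][k] * B[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    trace = M[0][0] + M[1][1]
    det_m = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = math.sqrt(max(trace * trace - 4.0 * det_m, 0.0))
    roots = [(trace + disc) / 2.0, (trace - disc) / 2.0]
    # From the first row of (M - root*I)u = 0; assumes M[0][1] is not zero.
    vectors = [(M[0][1], root - M[0][0]) for root in roots]
    return roots, vectors

def ratio(u, B, G):
    """u'Bu / u'Gu, the quantity maximised by the first latent vector."""
    def quad(A):
        return sum(u[i] * A[i][j] * u[j] for i in range(2) for j in range(2))
    return quad(B) / quad(G)

def max_latent_roots(n_tests, n_groups):
    """Number of tests or one less than the number of groups, whichever is smaller."""
    return min(n_tests, n_groups - 1)

# Hypothetical scores for three groups on two tests (illustration only).
GROUPS = [
    [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (2.0, 2.5)],
    [(4.0, 2.0), (5.0, 3.0), (4.5, 2.0), (5.5, 3.5)],
    [(2.0, 5.0), (3.0, 6.0), (2.5, 5.5), (3.5, 5.0)],
]
G, B = product_sum_matrices(GROUPS)
roots, vectors = latent_roots_and_vectors(G, B)
print(roots, max_latent_roots(4, 5))
```

Because G is the sum of B and a within-groups matrix, each latent root lies between 0 and 1, and substituting the first latent vector into the ratio u'Bu/u'Gu returns the first root.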

It follows that when only two groups are 
being considered equation (3) will give, at 
most, one latent root. This is in agreement 
with the fact that one dimension is sufficient 
to account for two mean points. In our ex- 
ample, five groups and four tests are con- 
cerned, so that the rank of our $G^{-1}B$ matrix 
is four, and the matrix will have four latent 
roots. However, these roots may not all be 
significant. The number which is significant 
indicates the number of dimensions necessary 
and sufficient to account for the group means. 

The assumptions involved when the above 
method is employed are not always clearly 
stated. In the first place, the variance—covari- 
ance matrices for the individual groups should 
be homogeneous, otherwise one is not justi- 
fied in combining the groups when getting the 
total product-sum matrix. Secondly, it is re- 
quired that the scores for the combined group 
on the several tests give a multivariate nor- 
mal distribution; for unless this is so, one is 
not justified in setting up linear functions of 
the raw test scores. 


Analysis of the Data 


The original raw scores on each of the tests 
were not normally distributed and, after in- 
spection, a logarithmic transformation was 
employed to stabilize their variances. This 
had the additional advantage of making the 
data more normal. 

Inspection of the standard deviations, given 
in Table 2, reveals some discrepancies. In 
each test the standard deviation for the 
brain-damaged group is the largest and on 
the first test is roughly three times as large 
as that for any other group. To check on the 
significance of these differences, F ratios be- 
tween variances on the tests were performed. 
These showed no differences between the 
variances on Test 4, time, but on Tests 1, 
disproportionality, 2, colour, and 3, rotation, 
the variances for the brain-damaged group 
were found to differ significantly from that 
of the group with the lowest variance on each 
of the tests. In other words, homogeneity of 
variance between the groups on Tests 1, 2, 
and 3 has not been achieved. 
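The size of these discrepancies can be gauged directly from the standard deviations in Table 2. The sketch below forms, for each test, the ratio of the brain-damaged group's variance to the smallest group variance. It assumes the F ratios were formed in this way from the transformed scores, which the text does not state explicitly, and it does not attempt the significance test itself, which would also require the degrees of freedom and F tables.

```python
# Standard deviations of the transformed scores, copied from Table 2.
# Group order: normal, brain damaged, depressive, schizophrenic, neurotic.
SD = {
    "Test 1": [0.0584, 0.1411, 0.0520, 0.0420, 0.0384],
    "Test 2": [0.0548, 0.0776, 0.0400, 0.0362, 0.0436],
    "Test 3": [0.1882, 0.3754, 0.2817, 0.2442, 0.1894],
    "Test 4": [0.2004, 0.2364, 0.1995, 0.2166, 0.2004],
}
BRAIN_DAMAGED = 1  # index of the brain-damaged group in each list

def variance_ratio(sd_large, sd_small):
    """Ratio of two variances, computed from their standard deviations."""
    return (sd_large / sd_small) ** 2

ratios = {
    test: variance_ratio(sds[BRAIN_DAMAGED], min(sds))
    for test, sds in SD.items()
}
for test, value in ratios.items():
    print(test, round(value, 2))
```

On these figures the ratio is far larger for Test 1 than for the other tests and is closest to unity for Test 4, which agrees with the pattern described above.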

As a result, our data are not ideally suited 
to the statistical model proposed since the 
brain-damaged group has an advantage over 
the other two groups in that it is more highly 
weighted on account of its larger standard 
deviations. However, from a practical point 
of view, this may not be a bad thing since 
the brain-damaged patients are precisely 
those which we are most interested in dif- 
ferentiating from the others. 

There is no simple way of testing for ho- 
mogeneity of covariance between covariance 
matrices derived from small samples, but 
rank-order correlation coefficients between the 
tests for each sample separately and the 
product-moment correlations for the com- 
bined samples were calculated. The results 
gave no indication of gross departures from 
homogeneity of covariance between the groups. 
The product-moment correlations for the com- 
bined groups are given in Table 3. 

In view of the findings in these preliminary 
investigations, it appeared reasonably legiti- 
mate, though not ideally so, to proceed with 
the calculations and obtain the latent roots 
and corresponding latent vectors of the ma- 
trix $G^{-1}B$. The latent roots in order of ex- 
traction were 0.3849, 0.2231, and 0.0137; the 
percentage variance accounted for by each 
being 61.9, 35.9, and 2.2, respectively. The 
first two roots were found to be highly sig- 
nificant, while the third was not significant 
(1). This showed that the means of the 
groups on the tests could be represented ade- 
quately in a two-dimensional space. 


Table 2 
Means and Standard Deviations of Transformed Drawing Test Scores 

                     Test 1             Test 2             Test 3             Test 4 
Group             Mean     SD        Mean     SD        Mean     SD        Mean     SD 
Normal           0.1219  0.0584     1.0400  0.0548     0.6265  0.1882     0.0482  0.2004 
Brain damaged    0.1870  0.1411     1.0948  0.0776     0.8297  0.3754     1.4399  0.2364 
Depressive       0.1251  0.0520     1.0522  0.0400     0.7003  0.2817     1.3284  0.1995 
Schizophrenic    0.0868  0.0420     1.0139  0.0362     0.6620  0.2442     1.3748  0.2166 
Neurotic         0.0983  0.0384     1.0300  0.0436     0.5846  0.1894     1.2412  0.2004 


Table 3 
Correlations Between the Measures for the Total Group 

The latent column vectors, corresponding 
to the first two latent roots, were then found. 
They are, respectively: 


$$[-0.2194,\ 1.0000,\ 0.0317,\ 0.6552]'$$


and 


$$[1.0000,\ 0.8018,\ -0.0746,\ -0.1922]'$$


Using these values as weights in equa- 
tion (1), two linear functions of the test 
scores are obtained: 


$$X_1 = -0.2194x_1 + 1.0000x_2 + 0.0317x_3 + 0.6552x_4 \qquad (4)$$

$$X_2 = 1.0000x_1 + 0.8018x_2 - 0.0746x_3 - 0.1922x_4 \qquad (5)$$

On substituting, in turn, the scores of each individual in these equations, composite scores for each on the new variates X_1 and X_2 are obtained. These variates are sometimes called "canonical variates"; they are orthogonal to each other. The correlation ratio between the X_1 scores and the groups is given by the square root of the first latent root, namely, 0.6204. It is the maximum correlation ratio possible between the groups and any linear function of their scores. The X_2 scores, in turn, give a correlation ratio with the groups equal to the square root of the second latent root, that is, 0.4723.
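As a rough illustration of the two steps just described, the sketch below applies the weights of the two latent vectors to one set of four test scores and takes the square roots of the first two latent roots. The input used is the brain-damaged group's row of means from Table 2; because the functions are linear, the result is that group's mean on each canonical variate. The function and variable names are assumptions made for the example.

```python
import math

# Weights from the two latent vectors given in the text.
W1 = (-0.2194, 1.0000, 0.0317, 0.6552)
W2 = (1.0000, 0.8018, -0.0746, -0.1922)

# First two latent roots reported in the text.
LATENT_ROOTS = (0.3849, 0.2231)


def canonical_scores(x):
    """Return (X_1, X_2) for a sequence of four transformed test scores."""
    if len(x) != 4:
        raise ValueError("expected scores on Tests 1 to 4")
    x1 = sum(w * v for w, v in zip(W1, x))
    x2 = sum(w * v for w, v in zip(W2, x))
    return x1, x2


# Brain-damaged group means on Tests 1 to 4 (Table 2).
brain_damaged_means = (0.1870, 1.0948, 0.8297, 1.4399)
X1, X2 = canonical_scores(brain_damaged_means)

# Correlation ratio of each variate with the groups: square root of its root.
correlation_ratios = [math.sqrt(r) for r in LATENT_ROOTS]

print(f"X_1 = {X1:.4f}, X_2 = {X2:.4f}")
print([round(c, 4) for c in correlation_ratios])
```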
In Fig. 2 the means of the canonical variate
scores for the five groups are plotted.
From this figure a picture of the relative positions
of the group means is obtained.


Classification 


At this stage, a further aspect of the prob- 
lem of discriminating between the groups can 
be considered. This involves calculating for 
each individual the likelihood of his belong- 
ing to each of the five groups and assigning 
him to that group for which his likelihood is 
greatest. The statistical procedure for doing 
this is given by Rao (9), and in less advanced 
mathematical language by Lubin (7). By em- 
ploying it, it is possible to partition either the 
original test space, or, alternatively, the ca- 
nonical variate space between the groups. The 
calculations involved are heavy, but fortu- 
nately, as shown in Eysenck’s paper (3), 
when only two variables are concerned the 
partitioning can be done satisfactorily by eye. 
Taking the original allocation to diagnostic 
categories as correct, lines radiating from a 
central point in the scattergram can be found 
which give a minimum of misclassification. 
For the sake of clarity, two scattergrams have 
been used here. Figure 3 contains the points 
for the normal, schizophrenic, and brain- 
damaged groups only, while Fig. 4 shows 
those for the depressive and brain-damaged 
groups only. 
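The assignment rule described above can be sketched in a simplified form. Under assumptions that go beyond the text (equal prior probabilities, a common within-group covariance matrix, and canonical variates rescaled to unit within-group variance), choosing the group of greatest likelihood amounts to choosing the group whose mean is nearest in the canonical variate space. The centroids below are hypothetical values for illustration only; they are not the group means of this study, and the partitioning by eye used here is not reproduced.

```python
import math


def classify(point, centroids):
    """Assign a point to the group with the nearest centroid (Euclidean)."""
    return min(centroids, key=lambda group: math.dist(point, centroids[group]))


# Hypothetical centroids in (X_1, X_2); illustrative values only.
centroids = {
    "normal": (1.80, 0.90),
    "schizophrenic": (1.95, 0.60),
    "brain_damaged": (2.02, 0.73),
}

assigned = classify((2.00, 0.75), centroids)
print(assigned)
```

With unequal dispersions, as reported for the brain-damaged group, the nearest-centroid rule is only an approximation to the likelihood rule.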


[Figure 2: plot of the group means against the two canonical variates; point labels include Brain damaged, Depressive, and Schizophrenic.]

Fig. 2. Relative positions of the mean canonical variate scores for the five groups.


Table 3 (continued)

Tests       1       2       3       4
1                 0.62    0.59    0.30
2                         0.44    0.12
3                                 0.30



Fig. 3. Individual canonical variate scores for the 
normal, schizophrenic, and brain-damaged groups. 


Discussion of the Results 


The discriminant function analysis tells us 
the minimum number of independent dimen- 
sions necessary and sufficient to account for 
the information contained in the test results 
and assigns weights to the tests which maxi- 
mize the differences between the group means 
along these dimensions. The test scores can 
now be replaced by the composite scores on 
the canonical variates—the means of which, 
for each group separately, are plotted in 
Fig. 2. The advantages of doing so are dis- 
cussed below. 

Comparing the F ratios and t tests for the
individual measures (Table 1) with those for 
the canonical variates (Table 4), the superi- 
ority of the latter as an aid to differentiating 
between the groups may not immediately be 
obvious, for the levels of significance have not 
been increased very noticeably. However, a 
glance at Table 3 shows that the four tests 
are correlated positively and to a consider- 
able extent with each other. As a result, the 
significance levels between the groups shown 
by the separate tests are not independent of 
each other and the amount of information 
available to us for discriminating between the 
groups appears greater than it really is. On 
the other hand, the two canonical variates are
uncorrelated, so that the significance levels
for the separate variates in Table 4 are in-
dependent. In other words, from the fact that
Variate 1, for example, differentiates between
two groups, we can draw no conclusion as to 
what result will be obtained when the same 
two groups are compared on Variate 2. 

A further advantage of obtaining the ca- 
nonical variates is that, whereas a four-di- 
mensional space would be required to locate 
our Ss if their scores on the four tests are 
used, the fact that only two latent roots were 
found to be significant means that the ca- 
nonical variates allow us to locate the Ss, 
without loss of information, in a two-dimen- 
sional space (see Fig. 2). The advantage thus 
achieved cannot be overemphasized. 

Like factors in a factor analysis, canonical 
variates are, as seen from equation (1), 
weighted measures of the test scores. For 
this reason there is a natural tendency to 
try to interpret canonical variates psycho- 
logically, just as factors are interpreted. This 
tendency is not one to which the investigator 
should succumb without due caution. For one 
thing, invariance of factors from one study to 
the next is achieved by rotation, so that the 
factors rotated to a psychologically meaning- 
ful position are not wholly dependent upon 
the selection of tests used in the study. On 
the other hand, the weights used to define 
a canonical variate are unique to the study 
in question and to the tests and groups used 
in that study. Moreover, these weights, like
regression weights in general, are known to
vary notoriously from one study to another,
especially so when the composition of the
battery is altered. However, with this warning
in mind, it will be helpful when discussing
the differences between our groups to give
psychological labels to the two canonical variates.
For instance, canonical Variate 1, with
high loadings on Tests 2 and 4, might tentatively
be thought of as defining a normality-abnormality
dimension, normals appearing at
one extreme, neurotics in the middle, and the
grossly abnormal groups at the other extreme.
We have, however, no ready interpretation
for the second canonical variate, which is
heavily weighted on Tests 1 and 2. Defining
this dimension poses a problem toward which
future research may be directed.

Fig. 4. Individual canonical variate scores for the
depressive and brain-damaged groups.

Now let us concentrate for a moment on 
Fig. 2. We may recall that the canonical 
variates, represented in it by the orthogonal 
axes, have the property of separating the 
group means from each other to the optimum 
degree. The mean points for the groups show 
that brain-damaged patients tend to have 
high scores on both canonical variates. Their 
mean is the most extreme of the means on 
both dimensions. 

On the normality-abnormality dimension, 
the mean of the brain-damaged group oc- 
cupies the most extreme position at the 
“abnormal” end, with the schizophrenic and 
depressive groups next, their means not dif- 
fering significantly from that of the brain- 
damaged group. On the second dimension, the 
brain-damaged group mean together with the 
normal group mean are at one extreme and 
are opposed to the schizophrenic and neurotic
group means at the other extreme,
with the depressives occupying an intermediate
position. More precisely, the brain-damaged
group mean is differentiated, on the one
hand, from the normal and neurotic groups’ 
means at a high level of significance (p < 
.01) by canonical Variate 1, and, on the 
other hand, from the schizophrenic, neurotic, 
and depressive groups’ means (p again being 
less than .01) by canonical Variate 2. 

This would appear to be of value from a diagnostic viewpoint. Of importance, too, is the fact that whereas the normal Ss tend to have low scores on the "normality-abnormality" dimension, the situation is the reverse for the schizophrenic group, the means for the respective groups differing in each case beyond the .001 level of significance. Indeed, a careful study of the t test results shown in Table 4 indicates that the canonical variates enable us to distinguish all group means from each other significantly on one or other, or on both, of the variates with the exception only that the neurotic group mean can be differentiated on neither variate from those of the depressive and schizophrenic groups. However, the main purpose of this investigation was to differentiate between the brain-damaged group on the one hand and the remaining groups on the other, and this purpose has been realized.

Before going on to consider the identification of individual Ss, we might note that a more meaningful position of the variates might be accomplished by rotation to a position approximately 45 degrees in a clockwise direction. We might then preserve the "normality-abnormality" dimension, and emphasize the interesting relative position occupied by the brain-damaged group. Such a position, possibly reflecting qualitative differences between brain-damaged and other groups, psychiatric and normal, might suggest fruitful lines along which further investigation could be directed.


Table 4
Significance of Differences Between the Groups on the Canonical Variate Scores

F ratios
Canonical Variate I       21.73**
Canonical Variate II      10.226**

t (2-tail)
                                      Canonical Variate
Groups compared                        I           II
Brain damaged v. normal              6.321**     1.379
Brain damaged v. depressive          1.648       2.629**
Brain damaged v. schizophrenic       1.868       5.770**
Brain damaged v. neurotic            3.029**     3.425**
Normal v. depressive                 3.429**     1.674
Normal v. schizophrenic              3.879**     4.932**
Normal v. neurotic                   2.195*      2.482*
Depressive v. schizophrenic          0.001       2.340*
Depressive v. neurotic               1.112       0.569
Schizophrenic v. neurotic            1.222       1.824

* Significance of .05 or above.
** Significance of .01 or above.


Differentiating groups, however, is not the 
same thing as diagnosing individual patients, 
and so we turn to the second aspect of our 
study (Figs. 3 and 4) which deals with the 
problem of classification. Ideally, the ca- 
nonical variate space should be partitioned 
into five distinct compartments, one for each 
group, in such a way that the number of mis- 
classifications is a minimum. This, as noted 
earlier, can be done efficiently by eye when 
the space is one of only two dimensions. How- 
ever, as the means of our neurotic group on 
the two variates are not differentiated from 
those of the depressive and schizophrenic 
groups—though the latter are differentiated 
from each other on Variate 2—it was de- 
cided, in the interest of clarity, to use two 
figures rather than one for classification pur- 
poses, although this is not the ideal procedure. 

In the first figure (Fig. 3), the members of 
the brain-damaged, schizophrenic, and nor- 
mal groups are represented, and the variate 
space is partitioned between them so as to 
achieve a minimum number of misclassifica- 
tions. A count of the number of individuals 
correctly and incorrectly classified is given in 
Table 5. The table shows the percentage of 
correct classifications to be high, although
there is still considerable overlapping between
the schizophrenic and brain-damaged groups, 
7 members out of the 33 of the latter group 
falling in the schizophrenic compartment. 

In Fig. 4 only the brain-damaged and depressive groups are represented. The results of partitioning the canonical variate space between them are shown in Table 6. Here again, although the percentages of correct classifications are fairly high, 13 out of the 33 brain-damaged Ss fall in the depressive compartment. Indeed, we must conclude that our efforts at differentiating the brain-damaged patients from the other groups, as seen from both figures, are not quite as successful as one might have hoped. The difficulty is due largely to the fact noted earlier that the variance of the brain-damaged group on the several tests is greater than that of the other groups. This greater variation is reflected in the canonical variate scores with the result that, as seen in Figs. 3 and 4, the brain-damaged patients are widely scattered.


Table 5
Identifications Comparing Normal, Schizophrenic, and Brain-Damaged Groups

                                 Actual diagnosis
Test diagnosis          Normal    Schizophrenic    Brain-damaged
Brain-damaged
Schizophrenic
Normal
Total
Percentage correct


Table 6
Identifications Comparing Brain-Damaged and Depressive Groups

                            Actual diagnosis
Test diagnosis          Depressive    Brain-damaged
Brain-damaged                4             20
Depressive                  13             13
Total                       17             33
Percentage correct         76.5%          60.6%
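The percentages in Table 6 follow directly from the counts. The sketch below recomputes them, and also an overall hit rate that the table does not report; the dictionary layout and function name are illustrative choices.

```python
# Counts from Table 6: actual diagnosis -> test diagnosis -> number of Ss.
table_6 = {
    "depressive": {"brain_damaged": 4, "depressive": 13},
    "brain_damaged": {"brain_damaged": 20, "depressive": 13},
}


def percent_correct(confusion):
    """Percentage of each actual group assigned to its own compartment."""
    return {
        actual: 100.0 * row[actual] / sum(row.values())
        for actual, row in confusion.items()
    }


per_group = percent_correct(table_6)
n_correct = sum(row[actual] for actual, row in table_6.items())
n_total = sum(sum(row.values()) for row in table_6.values())
overall = 100.0 * n_correct / n_total  # not given in the table itself

print({group: round(pct, 1) for group, pct in per_group.items()})
print(round(overall, 1))
```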

It should be noted that scores on canoni- 
cal Variate 1 are related to intelligence in a 
way indicating that high scores on the for- 
mer are associated with low scores on the 
latter. As the brain-damaged group were 
found to be significantly less intelligent than 
the other groups (2), it is clear that some of 
the differentiation between this group and 
others might be a function of differences in 
intellectual status. Consequently, an analysis 
of covariance was performed on canonical 
Variate 1 scores, equating the groups for in- 
telligence level. This resulted in a general 
slight reduction in significance levels of dif- 
ferences between the groups. This procedure 
was not necessary in the case of canonical 
Variate 2 which is not related to intelligence 
level. 


Summary 


Four measures of discrepancies in the re- 
production of simple designs by drawing were 
taken for each individual in five groups: normal,
neurotic, depressive, schizophrenic, and
brain-damaged. In view of theoretical and 
practical difficulties encountered in differenti- 
ating satisfactorily between these groups on 
the measures, the technique of discriminant 
function analysis was applied to the data in 
the hope that the amount of misclassification 
would be reduced. 

The statistical model was briefly described 
and also the advantages in using the particu- 
lar technique chosen. 

In terms of levels of significance, the dif- 
ferences between the groups on the canonical 
variate scores were not strikingly superior to 
those obtained on the measures before ap- 
plying the statistical technique. It might be 
noted that a similar conclusion was reached 
by Beech (2) after comparing the number of 
subjects identified as belonging to one or 
other of the groups under the two conditions. 
However, it was pointed out that certain dis- 
tinct advantages were to be gained by em- 
ploying the discriminant function method to 
the data, and it was clear that the test meas- 
ures combined in this way were clinically use- 
ful, especially where normals, schizophrenics, 
and brain-damaged subjects were concerned. 

A tentative interpretation of one of the two 
canonical variates was put forward. The sec- 
ond canonical variate presents a problem in 
interpretation. 

Finally, Vector 1 scores were found to be 
related to intelligence level, but it was re- 
ported that only slight reductions in the sig- 
nificance of differences between the groups 
resulted when the groups were equated for 
level of intelligence. 
The Yacorzynski Block Technique:
A Cross-Validation Study¹


B. G. Rosenberg 


Bowling Green State University 


and John Altrocchi 


Duke University 


In a recent study (1) the writers described 
a new sorting technique for diagnosing or- 
ganic brain damage. The Yacorzynski Block 
Technique (YBT) (3) consists of 16 blocks 
which the subject is asked to sort into four 
groups of four blocks each on the basis of 
some concept. There are eight possible solutions:
(a) shape of block, (b) color of block,
(c) shape of figure on block, (d) color of figure,
(e) height, (f) volume, (g) area of top
of block, and (h) area of figure. Apparently
a normal subject can achieve four concepts. 
The block test discriminated organics from 
nonorganics at a level significantly greater 
than would be expected by chance (1). 

In an effort to cross-validate these findings, 
50 male inpatients from a large VA GM&S 
hospital were tested. The sample consisted of 
all patients referred for testing during a ten- 
month period concerning whom there was any 
question of organic brain damage. The YBT 
was administered as one of a battery of tests 
by one of several different psychologists. The 
test administrator sometimes did and some- 
times did not know a patient’s criterion diag- 
nosis, which was established independently 
by the chief neurologist. A patient was considered organic (N = 28) if there was neurological evidence of damage rostral to the foramen magnum, and nonorganic (N = 22) if there was no evidence of such pathology (2).

¹ An extended report of this study may be obtained without charge from B. G. Rosenberg, Dept. of Psychology, Bowling Green State University, Bowling Green, Ohio, or for a fee from the American Documentation Institute. Order Document No. 5484, remitting $1.25 for microfilm or $1.25 for photocopies.

Patients were classified as organic or nonorganic from the YBT on the basis of two slightly different cutoff points: scores of four minus and four (1). Both systems correlated better with the criterion diagnosis than would be expected on the basis of chance (cutoff four minus: φ = .402, p < .01; cutoff four: φ = .345, .01 < p < .02). Age differences were not significant for the two groups (organics: M = 41.36, SD = 15.02; nonorganics: M = 38.18, SD = 12.07; p > .10). Age was not significantly correlated with performance on the YBT (r = -.09, p > .10).
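The significance levels quoted for the two cutoffs can be checked roughly, on the assumption that the two coefficients are phi coefficients (.402 and .345) from 2 × 2 tables on all 50 patients. For such a table chi-square equals N times phi squared, with one degree of freedom, and the tail probability can be obtained from the complementary error function. The sketch below is a check under that assumption, not the authors' computation.

```python
import math

N = 50  # 28 organic + 22 nonorganic patients


def chi_square_from_phi(phi, n):
    """Chi-square (1 df) for a 2 x 2 table with phi coefficient phi."""
    return n * phi ** 2


def p_value_df1(chi_square):
    """Upper-tail probability of chi-square with one degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


results = {}
for label, phi in (("four minus", 0.402), ("four", 0.345)):
    chi2 = chi_square_from_phi(phi, N)
    results[label] = (chi2, p_value_df1(chi2))
    print(f"cutoff {label}: chi-square = {chi2:.2f}, p = {results[label][1]:.4f}")
```

Under these assumptions the first value falls below .01 and the second between .01 and .02, in line with the levels reported.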

The results of this study tend to corrobo- 
rate the findings in the original study. The 
four-minus cutoff is recommended for clini- 
cal situations in that it provides a conserva- 
tive estimate of impairment, and tends to 
separate organics from other patients with 
fewer nonorganics misclassified as organics. 


Brief Report. 
Received December 4, 1957. 
Rapaport (2) listed approximately 13 different
types of deviant Rorschach responses
which are characteristically given by schizo- 
phrenic subjects. He divided these 13 types of 
responses into two broad groups, those which 
point conclusively to the presence of a schizo- 
phrenic disorder and those which point to the 
possible presence of a schizophrenic disorder. 
There are six deviant response types in the 
former group and seven deviant response types 
in the latter group. Rapaport discussed each 
type of response in terms of his “loss of dis- 
tance” and “increase of distance” rationale. 
Responses showing a loss of distance are in- 
flexibly and inappropriately bound by the 
configuration of the Rorschach blots, while 
responses showing an increase of distance are 
insufficiently bound by the configuration of 
the blots. 

Rapaport did not assign quantitative values 
to any of his deviant response types. Watkins 
and Stauffacher (3), however, did a study in 
which the initial step consisted of the assign- 
ment of quantitative values to most of Rapa- 
port’s deviant response types. Values of either 
.25, .50, or 1.00 were assigned to each deviant 
response type, with a value of .25 being as- 
signed to mildly deviant response types and a 
value of 1.00 being assigned to extremely de- 
viant response types. 

Using their scoring system, Watkins and 
Stauffacher independently scored 25 normal, 
25 neurotic, and 25 schizophrenic Rorschach 
records. Their combined scoring significantly 
differentiated the schizophrenic group from 
the normal group and from the neurotic 
group, and also differentiated the normal 
group from the neurotic group. Powers and 


Hamlin (1) replicated Watkins and Stauf- 
facher’s study, with similar group differentia- 
tions being obtained. 

Reliability of scoring in both Watkins and 
Stauffacher’s and Powers and Hamlin’s studies 
was based on the total scores assigned to the 
Rorschach records by the judges, with all 75 
Rorschach records being scored by the two 
judges in the former study and 15 randomly 
selected records being scored by the two 
judges in the latter study. The reliability coefficients
were, respectively, .78 and .88, and
both coefficients were highly significant.

Powers and Hamlin, however, pointed out 
the limited agreement between the two judges 
in singling out responses as scoreable. Hamlin 
did not assign any score at all to 35 per cent 
of the 133 responses which Powers had scored 
as pathological in one way or another, while 
Powers did not assign any score at all to 38 
per cent of the 142 responses which Hamlin 
had scored as pathological. 

The degree of agreement between Powers 
and Hamlin was even more limited when the 
assignment of responses to the specific scor- 
ing categories was considered. Of the 133 re- 
sponses scored by Powers, 86 per cent were 
either not scored at all by Hamlin or were 
assigned by him to a different scoring cate- 
gory; while of the 142 responses scored by 
Hamlin, 88 per cent were either not scored 
at all by Powers or were assigned by him to 
a different scoring category. To say the least, 
therefore, the scoring categories were not 
easily differentiable, and Powers and Hamlin 
suggested that Rapaport’s deviant response 
types should either be refined or revised. 
The Study


The present study rests on a complete re- 
vision of Rapaport’s deviant response types. 
Powers re-evaluated Rapaport’s descriptions 
of his deviant response types, as well as the 
Rorschach responses which Rapaport lists as 
examples for each type of deviant response. 
As a result of this procedure, Powers isolated 
10 continua which seemed to be either ex- 
plicit or implicit in Rapaport’s material and 
which seemed to be relatively distinct and 
homogeneous. 

A short description of each continuum is as 
follows, with an extreme example of each con- 
tained in parentheses following the descrip- 
tions: 


Continuum I, Cohesiveness—Loosely integrated re- 
sponses. (Card VIII, W. A room. [What makes it 
look like this?] Different colored furniture, different 
shapes.) 

Continuum II, Confusion—Any confusion on the 
part of the subject in communicating his response to 
the examiner. (Card I, W. Shadow pictures... 
desert picture... you lift up your hands in a 
desert picture . . . shadow what bothers you.) 

Continuum III, Form level—The degree of corre- 
spondence between the percept and the blot area 
used. (Card VII, W. It is not a shoelace, is it?) 

Continuum IV, Mangling—Mangled or withered 
content. (Card VII, W. What is left of a bat, mashed 
up by a truck, part of it decayed.) 

Continuum V, Social acceptability—Taboo con- 
tent, generally of a sexual nature. (Card II, W. In- 
tercourse; glorified kiss.) 

Continuum VI, Deterioration color—Gory color re- 
sponses. (Card IX, W. Not a healthy color . . . looks 
almost gangrenous.) 

Continuum VII, Increase of distance—Any tend- 
ency to go beyond the blot with insufficient justifi- 
cation. (Card VI, W, each half. Like going from 
Helsingfors to Leningrad . . . that kind of country; 
at dusk; thinly populated . . . lonely feeling.) 

Continuum VIII, Loss of distance—Any tendency 
to be inflexibly and inappropriately bound by the 
blot configuration. (Card IX, W. Sex organ (upper 
two-thirds), female ... these two (lower pink) 
seem to be men looking at it.) 

Continuum IX, Threat—Any response which shows 
anger, aggression, or fear. (Card VIII, upper middle 
space. Horrible face . . . fierce . . . twitching lips.) 

Continuum X, Affect—Overt expression of positive 
or negative feelings toward a given card or a given 
response. (Card IV. A hideous picture.) 


Once the 10 continua had been isolated, an 
analysis of the characteristics of each con- 
tinuum suggested that they could readily be 
combined into four large classes. Continua I, 
II, and III were combined to form Class A,
intellectual disorganization; Continua IV, V, 
and VI were combined to form Class B, devi- 
ant content; Continua VII and VIII were 
combined to form Class C, inappropriate in- 
crease or loss of distance; and Continua IX 
and X were combined to form Class D, affec- 
tive response. 

It was hypothesized that schizophrenics have 
a significantly higher or more pathological 
mean score on each of the four classes than 
either normals or neurotics, and it was also 
hypothesized that neurotics have a signifi- 
cantly higher mean score on each of the four 
classes than normals. 

No direct comparisons with the studies 
using Rapaport’s deviant response types are 
possible because of differences in method. In- 
direct comparisons will be made, however, 
concerning the respective reliabilities and va- 
lidities. 

Method 
Subjects 


Seventy-five Ss were used, 25 normals, 25 
neurotics, and 25 schizophrenics. All of the 
Ss were white, all were between 20 and 40 
years of age, all had at least 100 IQ, and all 
gave at least 15 Rorschach responses. No S 
was included in the experiment who did not 
meet all of these qualifications. 

The three groups of Ss were equated for 
sex, there being 11 males and 14 females in 
each group. They were also roughly equated 
for the variables of age, IQ, and education, 
with the schizophrenic Ss, as might be ex- 
pected, being generally older than the nor- 
mal and neurotic Ss and also having gener- 
ally less education and lower IQ ratings. 

All of the normal Ss were selected on the 
basis of adequate adjustment, with no his- 
tory of psychiatric treatment and no history 
of psychiatric disability in any members of 
the immediate family; all of the neurotic Ss 
were out-patients and no Ss were used who 
were deemed by the therapist to be a char- 
acter disorder, to have marked schizoid traits, 
or whose adjustment to reality was somewhat 
tenuous; all of the schizophrenic Ss had been 
hospitalized for periods ranging from one 
year through twelve years, with the diagnosis 
being quite definite for each case. 


Deviant Rorschach Response Characteristics


Table 1
Distribution of Scores for Class A, Intellectual Disorganization

                            Group
Score          Normals    Neurotics    Schizophrenics
10.0               0          0              0
10.0-10.9         18         19              1
11.0-11.9          6          5              5
12.0-12.9          0          0              1
13.0+              1          1             18


Table 3
Distribution of Scores for Class C, Inappropriate Increase or Loss of Distance

                            Group
Score          Normals    Neurotics    Schizophrenics
10.0               2          0              1
10.0-10.9         16          6              7
11.0-11.9          5         12              7
12.0-12.9          2          6              3
13.0+              0          1              7


Procedure 


A scoring scale was drawn up for each of 
the 10 continua, ranging in five-point intervals 
from Point 10 to Point 50. Point 10 consisted 
of an absence of deviance with respect to the 
characteristics of each continuum, while Point 
50 consisted of extreme deviance with re- 
spect to the characteristics of each continuum. 
Point 30 and Point 50 of each continuum 
were anchored by five illustrative Rorschach 
responses, taken almost entirely from Rapa- 
port’s examples. In addition, a description of 
the characteristic quality of each continuum 
was drawn up for the use of the scorers. 

One judge scored all 75 Rorschach records. Each response in every record received 10 separate scores, one score for each of the 10 continua. For the purpose of obtaining an estimate of scoring reliability, five Rorschach records were chosen at random from each of the three S groups, and these 15 Rorschach records were scored independently by a second judge following a period of training in the scoring system.


Table 2
Distribution of Scores for Class B, Deviant Content


The four class scores for each S were ob- 
tained by averaging the scores of the two or 
three continua making up each class. 
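The averaging step, and the total score formed later by summing the four class scores, can be sketched as follows. The grouping of continua into classes is taken from the text; the input is assumed to be one score per continuum for a given S, since the text does not say how the scores of individual responses were reduced to that form, and the example values are hypothetical.

```python
from statistics import mean

# Grouping of the 10 continua into four classes, as given in the text.
CLASSES = {
    "A": ("I", "II", "III"),   # intellectual disorganization
    "B": ("IV", "V", "VI"),    # deviant content
    "C": ("VII", "VIII"),      # inappropriate increase or loss of distance
    "D": ("IX", "X"),          # affective response
}


def class_scores(continuum_scores):
    """Average the continuum scores that make up each class."""
    return {
        name: mean(continuum_scores[c] for c in members)
        for name, members in CLASSES.items()
    }


def total_score(continuum_scores):
    """Sum of the four class scores."""
    return sum(class_scores(continuum_scores).values())


# Hypothetical continuum scores for one S (illustrative values only).
example = {
    "I": 10.0, "II": 12.5, "III": 11.0,
    "IV": 10.0, "V": 10.0, "VI": 10.0,
    "VII": 13.0, "VIII": 11.0,
    "IX": 10.0, "X": 10.5,
}

scores = class_scores(example)
total = total_score(example)
print(scores, round(total, 2))
```

A record with no deviance on any continuum scores 10 on every class and therefore 40 in total, the lowest possible total score.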


Results 
Validity 

Tables 1, 2, 3, and 4 contain the distribu- 
tion of scores for each of the three groups of 
Ss for Classes A, B, C, and D respectively. 
These class scores were obtained by averag- 
ing the scores of the constituent continua. 

Table 5 gives the t ratios obtained in the
various intergroup comparisons, with one as- 
terisk indicating significance at the .05 level 
and two asterisks indicating significance at 
the .01 level. 

The significant differences contained in Table 5 were all in the predicted direction with the exception of the neurotic-schizophrenic comparison for Class D. The neurotic group had a significantly higher mean score than the schizophrenic group for this class.


Table 2 (continued)

Score          Group
10.0           8   9
10.0-10.9      7   15   12
11.0-11.9      0   1   2
12.0-12.9      0   0   0
13.0+          0   0


Table 4
Distribution of Scores for Class D, Affective Response

                            Group
Score          Normals    Neurotics    Schizophrenics
10.0               4          3              4
10.0-10.9         14         10              7
11.0-11.9          4          5              2
12.0-12.9          3          5              1
13.0+              0          2              1


Table 5
t Ratios Based on the Comparison of the Mean Score Values of Each of the Two Classes for 25 Normal, 25 Neurotic, and 25 Schizophrenic Subjects

t Ratios

6193   3.54**
1.16   1.81
0.10   3.18**   3.76**


Table 7
Intercorrelations of the Four Classes (N = 75)


It may be seen from Table 5 that two of 
the four possible comparisons were signifi- 
cant in each of the three intergroup compari- 
sons. Class A and Class C together accounted 
for four of the six significant differences. It 
may be seen that Class A significantly differ- 
entiated the schizophrenic group from the two 
nonpsychotic groups, while Class C signifi- 
cantly differentiated the normal group from 
the two patient groups. 

Table 6 contains the distribution of total 
scores for each group, with the total score for 
each S being obtained by summating the four
class scores. 

Three intergroup comparisons were made using total scores and all three were significant at the .01 level. The t ratios for the normal-schizophrenic, neurotic-schizophrenic, and normal-neurotic comparisons were, respectively, 4.05, 3.27, and 2.95.


Table 6
Distribution of Total Scores, Using the Summation of Scores for All Four Classes for Each Subject

                            Group
Score          Normals    Neurotics    Schizophrenics


Table 7 contains the intercorrelations of the 
four classes. 

It may be seen from Table 7 that Classes 
A, B, and C have rather high intercorrela- 
tions, while the correlation of each of these 
three classes with Class D is negligible. 


Scoring Reliability 


Reliability coefficients were computed for
the 10 continua and the four classes. For
Continua I through X the reliability coefficients
were, respectively, .63, .37, .35, .08, .36,
.54, .33, .82, and .76; while for Classes A, B,
C, and D the reliability coefficients were .57,
.68, .81, and .89.

Since each class was composed of two or 
three continua, the correlation coefficients of 
the continua making up each class were aver- 
aged in order to find some explanation for the 
fact that the reliability coefficients for the 10 
continua were not more imposing. These corre- 
lation coefficients were respectively for Classes 
A, B, C, and D, .46, .73, .44, and .79. These 
values are lower than the corresponding reli- 
ability coefficients for three of the four classes, 
and considerably so for Class C. These data, 
in combination with inspection of the scoring 
of the - ds, indicate that the two scorers 
were .irly well agreed concerning the degree 
of deviance of responses with respect to each 
class but that the two scorers were much less 
in agreement as to which constituent con- . 


tinuum best described the deviance. The two - 


judges tended to disagree to some extent in 
assigning responses to Continua I or III of 
Class A and Continua IX or X of Class D. 


Class 
Class B Cc D 
Groups Class Class Class Class A 65 65 —.09 
Compared A B Cc D B 69 03 
Normal- 
Schizophrenic 
Neurotic- 
Schizophrenic 230° 
Normal- 
Neurotic 

40.0 0 0 0 

40.0-41.9 7 3 1 

42.0-43.9 14 11 2 

44.0-45.9 3 8 So 
46.0-47.9 1 2 4 

48.0-49.9 0 0 4 

50.0+ 0 1 9 


These differences were not extreme, however, 
and might well be lessened by a more succinct 
formulation of these continua. The two 
scorers, on the other hand, disagreed to a 
considerable extent in assigning responses to 
Continua VII or VIII of Class C. It seems 
clear that these two continua should be com- 
bined into one continuum, since duplication 
existed in the minds of the scorers if not in 
the descriptions of the continua. Logically, it 
is a simple matter to separate these two op- 
posed ways of responding to the Rorschach 
blots. In actuality, however, it is apparently 
much simpler for a judge to indicate that a 
given response deviates from the usual mode 
of responding to the inkblots to a greater or 
a lesser extent than it is for the judge to 
characterize this deviation as manifesting pri- 
marily “increase of distance” or primarily 
“loss of distance” from the blots.


Discussion

In the previous studies (1, 3) making use 
of Rapaport’s formulations, the judges were 
obviously making use of general impressions 
obtained from Rapaport, since they agreed so 
little in assigning responses to specific scoring 
categories. The present study was based on 
the assumption that these general impressions 
could be isolated from Rapaport’s material, 
and the authors feel that the four classes of 
the present study represent the core of these 
impressions. 

As stated earlier, no direct comparison is 
possible with respect to validity and scoring 
reliability between the present study and the 
two previous studies which made use of Rapa- 
port’s deviant response types. This is so be- 
cause the present study does not make use of 
one overall score in determining validity and 
reliability but, instead, has broken the over- 
all score down into four large classes. We can 
make a validity comparison on the basis of 
optimal cutoff points, using the total score 
from Powers and Hamlin’s study (1) and 
using the single most effective class in each 
intergroup comparison in the present study. 
This comparison suggests the two methods 
are roughly equal in effectiveness in making 
the same intergroup discriminations. If we 
consider the distribution of total scores con- 
tained in Table 6, the present study is even 
more effective comparatively, with a cutoff
score of 46.0 correctly identifying 18 schizo- 
phrenic Ss and misidentifying, respectively, 
only one normal S and three neurotic Ss. 
Even more effective in discriminating between 
groups, however, is the combination of scores 
from Classes A and C for each S. In the lat- 
ter instance, a cutoff score of 24.0 correctly 
identifies 19 schizophrenic Ss while misidenti- 
fying only one normal S and two neurotic Ss. 
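
The cutoff comparisons above amount to counting, for each group, how many Ss score at or above a chosen value. A minimal sketch of that count follows. The function name, the "at or above" rule, and the example scores are assumptions for illustration only; they are not the study's data or its stated procedure.

```python
def classify_by_cutoff(scores_by_group, cutoff, target="schizophrenic"):
    """Count subjects scoring at or above the cutoff in each group.

    Returns the number of target-group subjects flagged (correct
    identifications) and a dict of flagged counts for the other groups
    (misidentifications).
    """
    flagged = {
        group: sum(1 for score in scores if score >= cutoff)
        for group, scores in scores_by_group.items()
    }
    hits = flagged[target]
    false_positives = {g: n for g, n in flagged.items() if g != target}
    return hits, false_positives


# Hypothetical total scores, NOT taken from Table 6.
example = {
    "normal": [41.0, 42.5, 43.0, 46.5],
    "neurotic": [42.0, 44.5, 47.0, 50.5],
    "schizophrenic": [43.0, 46.0, 48.5, 51.0],
}
hits, false_positives = classify_by_cutoff(example, cutoff=46.0)
print(hits, false_positives)
```

Raising the cutoff in such a sketch trades fewer misidentified normal and neurotic Ss against fewer correctly identified schizophrenic Ss, which is the trade-off behind the choice of an optimal cutoff point.
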
The scoring reliability coefficients in both 
Watkins and Stauffacher’s and Powers and 
Hamlin’s studies were highly significant and
quite imposing in size. The scoring reliability 
coefficients of the four classes used in the 
present study were also highly significant. 
Furthermore, there is every reason to expect 
that these latter values can be increased by 
better delineation of the continua and the 
amalgamation of some of the continua. The 
set of categories developed in the present 
study seems to have greater potentiality for
improvement than do those developed by 
Rapaport. Moreover, the four classes appear 
to have more psychological relevance than 
Rapaport’s categories, and these four classes 
concern four important and relatively non- 
overlapping aspects of Rorschach productions. 
The category of “intellectual disorganization,” 
for example, is much more communicable and 
self-evident in meaning than are categories 
such as “fabulized-combination” or “con-
tamination.” Such obscure terms as the two
latter ones tend to cast a pall of mystery over 
Rorschach interpretation and tend to isolate 
the Rorschach “magician” from those of his 
associates who prefer to deal with more con- 
crete terms. Furthermore, terms such as “in- 
tellectual disorganization” are much more 
amenable to statements concerning degree 
than are the previously mentioned terms, 
which almost seem bound to a “sign” ap- 
proach. May we be delivered from signs and 
the occult. 


Summary 


Rapaport’s material concerning deviant ver- 
balizations on the Rorschach test was ana- 
lyzed by one of the authors, and 10 continua 
were isolated, which seemed to be present ex- 
plicitly or implicitly in this material. An 
analysis of the characteristics of these 10 
continua showed that they could be combined
into four large classes, namely, intellectual 
disorganization, deviant content, inappropri- 
ate increase or loss of distance, and affective 
response. Scoring scales were devised for each 
of the 10 continua, and it was hypothesized 
that the scoring reliability coefficients for each 
of the 10 continua would prove to be statisti- 
cally significant, as well as the scoring reli- 
ability coefficients for each of the four classes. 

It was further hypothesized that schizo- 
phrenic Ss have significantly higher or more 
pathological scores on each of the four classes 
than normal Ss and neurotic Ss and, also, that 
neurotics have significantly higher scores on 
each of the four classes than normal Ss. 

The scoring reliability coefficients of all
four classes proved to be statistically signifi-
cant, but only five of the ten continua had
significant reliability coefficients. In each of
the three possible intergroup comparisons,
only two of the four classes yielded signifi-
cant differences, and one of the two signifi-
cant differences in the neurotic-schizophrenic
comparison was opposite in direction to that
predicted.

Class A, intellectual disorganization, signifi- 
cantly differentiated the schizophrenic group 
from the two nonschizophrenic groups, while 
Class C, inappropriate increase or loss of dis-
tance, significantly differentiated the normal
group from the two patient groups. These two
classes were the most effective in differenti-
ating the groups.

No direct comparisons are possible between 
the results of the present study and the re- 
sults of the two previous studies making use 
of Rapaport’s categories because of the dif- 
ferences in methods. Indirect comparisons, 
however, suggest that the present results are 
at least equal to those of the previous two 
studies insofar as scoring reliability and in- 
tergroup differentiations are concerned. Fur- 
thermore, the four classes isolated in the pres- 
ent study appear to have more psychological 
relevance and are more communicable and 
less occult than are the previous categories. 
Future studies are contemplated using the re- 
vised versions of the 10 continua. 


Received April 17, 1957. 


Rorschach Concept Evaluation Test as a
Diagnostic Tool¹


Laverne C. Johnson 
Washington University School of Medicine and Research Laboratories 


Skepticism about the reliability and validity of
the clinician’s sensitivity in interpreting Ror-
schach responses and the three to four hours
needed to administer, score, and interpret a
single protocol have motivated many to search
for a quantitative score and a briefer form.
The various quantitative and sign scoring sys- 
tems were reviewed by Knopf (3), and their 
ability to differentiate effectively among psy- 
choneurotics, psychopaths, and schizophrenics 
was tested. From his study, Knopf con- 
cluded, “For all practical purposes Rorschach 
summary scores cannot be regarded as ef- 
fective in differentiating among psychiatric 
groups” (3, p. 104). 

Use of group techniques has fared little 
better. As the interpretative hypotheses that 
form the backbone of the Rorschach tech- 
nique were developed on the basis of an indi- 
vidual test situation, the applicability of these 
hypotheses to responses obtained in groups 
has been questioned (2). As a psychometric 
technique, group methods must rely for their 
interpretation upon statistically established 
differentiation among groups. Because of the 
difficulty in group testing of psychiatric pa- 
tients, especially schizophrenics, group norms 
for differential diagnosis are hard to obtain. 

1 This study was supported in part by a research
grant from the National Institute of Mental Health
of the National Institutes of Health, United States
Public Health Service.

The Rorschach Concept Evaluation Tech-
nique (CET), though originally developed by
McReynolds (4) to study perceptual differ-
ences and psychopathology, permits the cal-
culation of the quantitative scores sought by
those who have eagerly grasped at the sign
approach. Ease in administering the CET and
its brevity, about 15 minutes, also offer the
possibility of its use as a screening technique 
in a psychiatric setting and yet it can be in- 
dividually administered. Essentially, the CET 
involves the presentation to the patient of 50 
specified Rorschach blot areas, each paired 
with a stated concept. The patient is then 
asked to decide whether the indicated areas 
could or could not represent the stated con- 
cepts. The patient is requested to answer Yes 
or No to each presentation, and his score is 
determined on the basis of his answers. The 
test and its development are described in de- 
tail elsewhere (5). In his article, McReynolds 
(5) elaborated on the clinical use of the CET, 
but made no attempt to evaluate its effec- 
tiveness as a diagnostic instrument for indi- 
vidual patients. McReynolds felt at that time 
that the CET could be best used as an ad- 
junct to the traditional Rorschach technique. 
While there is no question that the CET 
plus the Rorschach will give the most infor- 
mation, the purpose of this study is to in- 
vestigate the usefulness of the CET inde- 
pendent of the Rorschach. Specifically, how 
effective is the CET in differentiating among 
hospitalized psychoneurotics, sociopathic per- 
sonality disorders, and schizophrenics? In ad- 
dition, this study contributes data supporting 
the group norms reported by McReynolds. 
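
The Yes/No format described above lends itself to a simple tally. The sketch below counts Yes answers over the 50 items, which is the raw quantity from which an agreement score would be derived. The function name and input format are assumptions for illustration; the conversion to the published T scores is not reproduced here because the conversion table is not given in this report.

```python
def raw_agreement_count(answers):
    """Count Yes answers across the 50 CET items (raw count only)."""
    if len(answers) != 50:
        raise ValueError("the CET presents 50 blot-concept pairs")
    normalized = [answer.strip().lower() for answer in answers]
    if any(answer not in ("yes", "no") for answer in normalized):
        raise ValueError("each answer must be Yes or No")
    return normalized.count("yes")


# Hypothetical protocol: 20 Yes answers followed by 30 No answers.
example_answers = ["Yes"] * 20 + ["No"] * 30
print(raw_agreement_count(example_answers))
```
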


Procedure 


The patients in this study were all hos-
pitalized at Malcolm Bliss Hospital, a city
diagnostic and acute treatment hospital, and
carefully examined by four psychiatrists² for
diagnostic discrimination in a larger study of
schizophrenia (6). Three of the four psy-
chiatrists, using stringent written criteria, had
to agree as to the diagnosis before the pa-
tient was accepted for the project. Testing
was done after the patient was accepted and
before treatment was begun, usually within
one week after admission. The number of pa-
tients in each category and the mean ages
and IQs are listed in Table 1. The IQ is based
upon four subtests of the Wechsler-Bellevue,
Form I: Information, Comprehension, Simi-
larities, and Vocabulary. There was no sig-
nificant difference among the groups com-
pared with respect to age or IQ. The mean
ages of the sample reported by McReynolds
were paranoid schizophrenics, 35.0, schizo-
phrenics other than paranoid, 34.16, and neu-
rotics, 29.16. These ages are not significantly
different from those of the present study. Mc-
Reynolds listed mean education levels and
did not report mean IQ scores for his groups.

2 The psychiatrists were George A. Ulett, Eli Rob-
ins, Kathleen Smith, and Neil McCullough.


Table 1
Mean Age and Mean IQ of Present Sample

                                              Age               IQ
Group                                  N     M      SD       M       SD
Paranoid schizophrenic                24   36.79    8.10   105.63   15.00
Schizophrenic other than paranoid     29   38.8     7.16    96.81   14.16
Acute schizophrenic                   21   30.31    8.21   102.00   17.65
Chronic schizophrenic                 32   33.96   13.22   101.34   13.22
Neurotics                             34   31.22    8.67   103.76   13.59
Sociopathic personality disturbance   27   28.35    5.76   103.50   12.52


The CET was administered as part of the
research battery and in most cases the regular
Rorschach was administered before the CET.
However, in some instances, due to pressure
of time, the patient was told in the usual in-
structions to look through the cards, but the
responses were not recorded nor was there an
inquiry. Comparison of the two approaches
revealed no differences in CET scores.

The scores used in this study are those re-
tained by McReynolds for clinical use (5).
These scores are J, derived from number of
Yes responses or agreements with the stated
concept; V, a measure of consistency of judg-
ment which reflects the consistency with which
the patient applies his own standards of
evaluation (as indicated by J) to the vari-
ous items in terms of the item criteria based
upon a normal sample; E, derived from the
relationship of J and V; and R, number of
Rorschach responses, if Rorschach was ad-
ministered before CET. The T score conver-
sion table reported in the original study was
used in this study.
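
The acceptance rule in the Procedure, agreement of three of the four psychiatrists, can be expressed as a small decision function. The sketch below is illustrative only: the function name, the label strings, and the return convention are assumptions, and the written diagnostic criteria themselves are not modeled.

```python
from collections import Counter


def accepted_diagnosis(ratings, required=3):
    """Return the diagnosis shared by at least `required` raters, else None."""
    if not ratings:
        return None
    label, count = Counter(ratings).most_common(1)[0]
    return label if count >= required else None


# Hypothetical ratings from four psychiatrists.
print(accepted_diagnosis(
    ["schizophrenic", "schizophrenic", "schizophrenic", "neurotic"]
))
print(accepted_diagnosis(
    ["schizophrenic", "schizophrenic", "neurotic", "neurotic"]
))
```

With four raters and a requirement of three, at most one diagnosis can meet the criterion, so the result is never ambiguous.
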


Results 


In Table 2 are listed the CET scores for
McReynolds’ study and for the present study.
As the Rorschach was not administered to all
patients, the N for R, number of Rorschach
responses, is smaller than that for scores de-
rived from the CET alone. Comparison of
group means in Table 2 reveals a high degree
of similarity between the present data and
those reported by McReynolds. Inspection of
Table 2 also suggests, and t tests confirm,
that there are no differences between the
psychoneurotic and the sociopathic personality
trait disorder groups on any of the CET
scores. These two groups were then combined
and referred to as the nonschizophrenic pa-
tients for comparison with schizophrenic pa-
tients. The schizophrenic group was cate-
gorized as to diagnosis, paranoid and non-
paranoid, and duration of illness. Patients
whose symptoms were of recent onset, 12
months or less, were called acute. Those ill
for more than 12 months were listed as
chronic. In Table 3 are the t values for the
J, V, R, and E CET scores with various
diagnostic groupings. Values of t for J and R
scores do not reach significance for any of
the comparisons. The results indicate that
neither the number of concepts agreed to on
the CET nor the number of responses given
on the Rorschach can be used to differentiate
among the groups compared. It was expected
that the paranoid schizophrenic would agree
to fewer concepts on the CET than the other
schizophrenic and nonschizophrenic patients.
This difference was not statistically signifi-
cant (p = .07).


Table 2
Comparison of McReynolds’ Normative Data Mean Scores on the CET with Present Sample

                                           Concept evaluation test score
                                            J                 V                 R                 E
Group                                  N    M     SD     N    M     SD     N    M     SD     N    M     SD
McReynolds
  Paranoid schizophrenics             31  42.72 19.15   28  45.61  9.03   31  45.84 12.30   28  49.78 11.48
  Schizophrenics other than paranoid  34  48.21 21.95   28  37.50 13.27   34  43.82  9.72   28  48.50 14.94
  Neurotics                           34  46.65  9.88   34  50.82  7.64   34  44.06  7.88   34  53.18  8.10
Johnson
  Paranoid schizophrenics             24  37.50 15.19   24  43.00  8.35   17  45.65 12.74
  Schizophrenics other than paranoid  29  45.34 13.56   29  36.41 10.80   21  40.71 12.92   29  43.86 11.61
  Neurotics                           34  42.70 12.78   34  48.73  8.49   27  44.11 10.28   34  54.54  8.97
  Sociopathic personality disturbance 27  43.15 10.36   27  48.52  8.32
  Acute schizophrenic                     41.42
  Chronic schizophrenic                   42.75 15.99

Remaining entries for the Johnson sample (row and column alignment lost):
  24 51.71
  8.04   21 45.48   27 51.88
  10.42   12 45.58 9.86   21 44.05 9.74
  10.46   24 41.50 13.37   32 48.81 13.00
  34.15
  32 42.50


Table 3
Comparison of CET Scores: t Values

                                                        Concept evaluation test score
Groups compared                                            V          E
Total schizophrenic-Total nonschizophrenic               4.79***    2.68**
Paranoid schizophrenic-Nonschizophrenic                  2.69**      .66
Schizophrenic other than paranoid-Nonschizophrenic       3.53
Paranoid schizophrenic-Schizophrenic other than paranoid 2.34*      2.52**
Acute schizophrenic-Chronic schizophrenic                2.67**     1.41
Acute schizophrenic-Nonschizophrenic                     $.35**
Chronic schizophrenic-Nonschizophrenic                   2.77**     1.66

J values (row alignment lost): 1.54, .71, 1.83, .28, .42, .10

* p = .05 level.
** p = .01 level.
*** p = .001 level.

Consistency of judgment score, V, is the
only CET measure that is significantly dif- 
ferent for all the groups compared. This 
difference for schizophrenics and nonschizo- 
phrenics is highly significant. If a V cutting 
score of 40 or below is used, only 9 (14.7%) 
of the nonschizophrenic patients receive scores 
below 40, while 31 (65%) of the schizo- 
phrenic groups receive scores below 40. 
Most of the schizophrenics who receive V 
scores above 40 are in the chronic group. 
Eighteen (56.2%) of the chronic group re- 
ceive V scores above 40 while only 4 (19%) 
of the acute cases receive V scores in the nor- 
mal range. Thus, the V score seems to be a 
fairly sensitive indicator of an inconsistency 
of judgment which is seldom seen in non- 
schizophrenics, and less often in chronic 
schizophrenics, but which is fairly charac- 
teristic of the acute schizophrenic patient. 
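
Several of the percentages in the paragraph above can be checked directly against the group sizes in Table 1. The short sketch below does so for three of them. Treating the Table 1 Ns (34 neurotics plus 27 sociopathic patients, 32 chronic and 21 acute schizophrenics) as the denominators is an assumption; the report does not state the denominators explicitly.

```python
def percent(count, total):
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Group sizes from Table 1; their use as denominators is an assumption.
N_NONSCHIZOPHRENIC = 34 + 27
N_CHRONIC = 32
N_ACUTE = 21

nonschizophrenic_low_v = percent(9, N_NONSCHIZOPHRENIC)
chronic_high_v = percent(18, N_CHRONIC)
acute_high_v = percent(4, N_ACUTE)
print(f"{nonschizophrenic_low_v:.2f} {chronic_high_v:.2f} {acute_high_v:.2f}")
```

Under these assumptions the three values come to about 14.75, 56.25, and 19.05 per cent, which agree with the reported 14.7%, 56.2%, and 19% to within rounding.
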

McReynolds (5) speculated that the vari- 
ability of standards implied by a low V score 
may be a function of one of two factors. 
First, the low V may indicate a fluctuating, 
unpredictable set of standards. Second, the 
low V may indicate that the patient’s stand- 
ards of evaluation are partly a function of 
the meaning to him of the stimuli. For ex- 
ample, a patient might tend to say No when 
the concept presented referred to a person-
ality area against which he had a need to
defend. While one or both of the above may 
be true in some patients, behaviorally V 
seems to be most related to the effectiveness 
of the patient’s controls at the time of test- 
ing. Also V may be used to evaluate changes 
induced by therapy.³

3 Twelve patients on whom records were secured
six months after admission showed a significant in-
crease in mean V scores, 36.8 to 48.7. Four patients
initially had V scores above 40, while three had V
scores of 40 or below at six months. These three
were the only patients still in the hospital at time
of retest.

E, J minus V, was the least defined of the
clinical scores by McReynolds. It was tenta-
tively conceived as a measure of rigidity, but
Block’s work (1) offered no strong support
for this view. The results of this study also
do not offer support for E as an important
variable. While the difference between schizo-
phrenics and nonschizophrenics was signifi-
cant, if the cutting score of 40 or below is
used again 38 (71.3%) of the schizophrenic
group receive “normal scores.” However, only 
three (4.8%) nonschizophrenic patients re- 
ceived an E score below 40. Again like V, the 
E score seems to be highly diagnostic if be- 
low 40 but not too helpful if above 40. No 
particular pattern of low E scores seemed to 
be present except that patients who agree to 
a large number of concepts have loose stand- 
ards of judgment and are more likely to have 
a low V and thus a low E. This seems to be 
a consistent pattern which has been referred 
to above as indicative of inadequate controls. 
Thus the E score adds little to the J or V 
score interpretations. 

While no systematic study of mental de- 
fectives and organics was made because of 
the small number of cases seen (N = 12),
the scores from these groups, especially the 
V score, were similar to the schizophrenic 
group. Chronic brain syndromes (paresis and 
alcoholism) and patients with IQ below 60 
tended to have V scores below 40. These 
findings agree with those reported by Mc- 
Reynolds (5). 


Summary and Conclusions 


The Rorschach Concept Evaluation test 
has been investigated as a technique for ob- 
taining a quantitative score that might be 
useful in diagnosis and as a brief screening 
technique which could be individually ad- 
ministered. In the present study, data of 63 
carefully diagnosed schizophrenics and 34 
neurotics were consistent with the original 
norms reported by McReynolds for schizo- 
phrenics and neurotics. The effectiveness of 
the CET as a diagnostic tool independent of
the Rorschach was evaluated by comparing
the clinical scores J, V, E, and R received
by schizophrenics, neurotics, and sociopathic 
personality disorders. No difference between 
the neurotics and the sociopathic group was 
found. Only the V score significantly differ- 
entiated between the schizophrenic and non- 
schizophrenic groups. Only 15% of the non- 
schizophrenic group received scores indicating 
marked inconsistency of judgment, while 65% 
of the schizophrenic group received scores 
indicative of inconsistent standards. Acute 
schizophrenics had lower V scores than
chronic schizophrenics and the V scores of
paranoid schizophrenics were not as low as
other schizophrenics. The V score is felt to 
be useful in diagnosis, but its use as an indi- 
cator of the patient’s control at the time of 
testing regardless of diagnosis may be more 
important. The E score added little informa- 
tion beyond that given by the J and V scores. 


Movement Responses and Creativity¹


Dorothy Park Griffin 
North Carolina State Board of Public Welfare 


This study deals with the tendency of hu- 
man beings to perceive movement in suitable 
stimuli presented pictorially in a static visual 
field. The well-known M response of the Ror- 
schach technique is the typical example of 
such a movement response. The behavioral 
correlates of M, according to Rorschach, are 
“more individualized intelligence, greater cre- 
ative ability, more ‘inner life,’ stable affec- 
tive reactions, less adaptable to reality, more 
intensive than extensive rapport” (5). The 
purpose of this study is to investigate the re- 
lationship between movement responses and 
these behavioral correlates, particularly crea- 
tivity. 

Beck, Hertz, and Klopfer and Kelley have
affirmed agreement with Rorschach’s state- 
ment of these relationships. Schachtel (7) is 
somewhat more cautious, stating that M re- 
sponses “represent a factor in the capacity for 
creative experience.” However, Anne Roe (4), 
in her research on eminent painters, found no 
clear relationship between M and creativity. 
In fact, she found a relatively low number of 
M responses. Rust (6) found a slight but sig- 
nificant negative correlation between creativ- 
ity and the movement response as elicited by 
the Levy Movement Blots. Zubin, referring 
to seven studies of creative ability conducted 
in the laboratory of the New York Psychi- 
atric Institute on creative versus noncreative 
writers, mathematical statisticians, and high 
school students, states that “all have failed 
to reveal any differences on Rorschach per- 
formance, and even tests especially designed 
to elicit movement have failed” (9). 


Procedure 


Using the Levy Movement Blots, a tech-
nique devised by David Levy, with the ex-
press purpose of eliciting movement, two ex-
periments were conducted with students in a
women’s liberal arts college to investigate the
relationship between movement responses and
creativity.

1 This research was done while the author was at
Meredith College, Raleigh, North Carolina.

Selected for the first study were 20 college 
women rated as highly creative by at least 
one teacher and two students and 20 others, 
matched in so far as possible in age, sex, year 
in college, and intelligence (measured by 
ACE scores), who were rated by at least one 
teacher and two students as noncreative. An 
effort was made to secure a sampling of cre- 
ative students from all major departments. 
However, the majority of the group judged as 
highly creative turned out to be majors in 
art, English, and home economics, while the 
noncreative group included no art or home 
economics majors and only two English ma- 
jors. 

No formal rating scale of creativity was 
used, but heads of major departments were 
asked to select students judged to be genu- 
inely creative, according to their own stand- 
ards and also in the light of Murphy’s defi- 
nition (3). Then they were asked to name 
several more whom they judged to be aver- 
age in general ability but lacking in crea- 
tivity. Student raters were given the same in- 
structions. 

The mean age of the creative group was 
21.5 years and of the noncreative 21.2, range 
20-22 years. Sixteen of each group of 20 were 
seniors in college and four were juniors. The 
mean ACE score for the creatives was 99.75; 
range 58-134. The mean ACE score for the 
noncreatives was 98.5; range 63-141. 

Since several other women judged as highly 
creative could not be included in the care- 
fully equated group because of highly su- 
perior ACE scores, it was decided to run a 
second experiment in which the sample would 
be composed of the first two groups plus five
additional highly creative subjects with su- 
perior ACE scores and five additional stu- 
dents judged as noncreative, matched ap- 
proximately for age and year in college but 
not for intelligence, since such matches 
seemed to be unavailable. In this way it was 
thought that something might be learned of 
the extent to which intelligence, as well as 
creativity, plays a part in the movement re- 
sponse. The mean ACE score for the five 
additional creatives was 132.6 as compared 
with 91.6 for the five additional noncreatives. 
The Levy Movement Blots were adminis- 
tered individually to each of the 50 women 
in the two studies, according to the directions 
in the Zubin Manual (8) and scored on the 
basis of 21 scales, including Levy’s three- 
step evaluation scale, the Zubin scales, and 
three scales devised by the author to measure 
Part B of the Levy, called the “phantasy 
test.” The examiners were seven psychology 
majors trained by the author. Each of the 
examiners first took the test herself and then 
spent several hours in group sessions with the 
author, scoring responses and discussing scor- 
ing problems. Later sessions were devoted to 
the recording and group scoring of a demon- 
stration test administered by the author. No 
statistical analysis has been made for ex- 
aminer errors, but the group had uniform 
training with ample opportunity for discus- 
sion of problems and for checking with other 
examiners and with the author on doubtful 
points. Further to minimize examiner errors, 
such as the halo effect, an effort was made 
to divide, as equally as possible, the number 
of creative and noncreative students tested 
by each examiner. Complete protocols of each 
subject were obtained for Parts A and B of 
the test, typed and scored by the student ex- 
aminer, and spot-checked by the author. 


Results 


The mean scores and differences, the stand- 
ard deviations and critical ratios on the 21 
movement scales were computed for the crea- 
tive and noncreative groups in the two stud- 
ies. Only one scale, “Control of Movement,” 
showed a difference significant beyond the five 
per cent level of confidence. On both studies 
this can best be interpreted as entirely nega-
tive. The method of “rank scores” as well as
the t test was used to measure the compari-
sons. The two tests yielded the same results,
even the probability levels being very close. 
The data were also analyzed by the chi-square 
method, which gave substantially the same 
results. 
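
For readers unfamiliar with the critical ratio mentioned above, the sketch below shows one common large-sample form: the difference between two group means divided by the standard error of that difference. The report does not give its computing formula, so this particular form, the function name, and the example figures are all assumptions for illustration.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference of two means divided by the standard error of the difference."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Hypothetical scale means and standard deviations for two groups of 25.
ratio = critical_ratio(10.0, 2.0, 25, 9.0, 2.0, 25)
print(round(ratio, 3))
```

A negative ratio in this convention means the first group (here, by assumption, the creative group) scored lower than the second, which is the sense in which the findings are described as negative.
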

Further to test the relationship of intelli- 
gence to differences on the movement scales 
between creative and noncreative groups, the 
ACE scores of each group of 25 were ranked 
and divided into high, medium, and low in- 
telligence subgroups. The range of the crea- 
tive subgroups was: high, 119-162; medium, 
91-115; low, 58-89; and of the noncreative 
subgroups: high, 110-141; medium, 86-110; 
low, 63-84. The means of each subgroup of 
creatives and noncreatives on the Control of 
Movement scale, on which a significant nega- 
tive difference had been obtained, were com- 
pared. The difference in each case was in 
favor of the noncreative group, i.e., negative. 

While the difference in ACE level was defi- 
nitely greatest in the high group of the crea-
tives, the significant difference between crea-
tives and noncreatives on this scale is found
not at this level but between the medium in-
telligence groups, where the ACE scores vary
only a few points both in means and range. 
Thus it would seem that the significant dif- 
ference is quite independent of the intelli- 
gence factor. 

A comparison of the mean scores and dif- 
ferences of the group of 25 creatives on the 
21 scales was also made with a norm group 
of 100 women from the same college, se- 
lected at random from the various college 
classes and given the test during the follow- 
ing year by ten psychology majors, trained 
by the author, as previously described. No 
significant differences were found, but the di- 
rection of difference was negative as in the 
other studies. 


Discussion 

In general, these negative findings agree 
with those of Roe (4), Rust (6), and Zubin 
(9). The weight of experimental evidence 
seems to be definitely against the classic Ror- 
schach interpretation of M as signifying 
“more individualized intelligence” and greater 
creative ability. It becomes increasingly evi-
dent, as Burchard states, “that we cannot
reach the secret of creativity by counting 
M’s” (1). In fact, Rorschach himself re- 
garded his work as experimental and sug- 
gested that his conclusions be regarded more 
as observations than as theoretical deduc- 
tions, since, as he wrote in 1911, “the theo- 
retical foundation for the experiment is for 
the most part still quite incomplete,” a state- 
ment which could well describe the situation 
today (2). 


Received May 15, 1957. 


References

1. Burchard, E. M. L. The use of projective tech-
niques in the analysis of creativity. J. proj.
Tech., 1952, 16, 412-427.

2. Klopfer, B., Ainsworth, Mary D., Klopfer, Walter
G., & Holt, Robert R. Developments in the
Rorschach technique. Vol. I. Yonkers, N. Y.:
World Book, 1954.

3. Murphy, G. Personality. New York: Harper,
1947.

4. Roe, A. Artists and their work. J. Pers., 1946, 15,
1-40.

5. Rorschach, H. Psychodiagnostics. New York:
Grune & Stratton, 1942.

6. Rust, R. M. Some correlates of the movement
response. J. Pers., 1948, 4, 369-401.

7. Schachtel, E. G. Projection and its relation to
character attitudes and creativity in the kines-
thetic response. Psychiatry, 1950, 13, 69-100.

8. Zubin, J., & Young, K. M. Manual of projective
and cognate techniques. Madison, Wis.: Col-
lege Typing Co., 1948.

9. Zubin, J. Failures of the Rorschach technique.
J. proj. Tech., 1954, 18, 303-315.


Journal of Consulting Psychology 
Vol. 2, 1958" 


Approaches to Reliability of Projective Tests with
Special Reference to the Blacky Pictures Test¹,²


Samuel Granick and Norma A. Scheflen 
St. Christopher's Hospital, Philadelphia, Penna. 


Reliability data are frequently sparse or ab- 
sent for projective tests. Twenty tests are re- 
viewed in Anderson and Anderson (2), but 
significant information on reliability is pre-
sented for only four tests, preliminary findings
for five, and no information for the remainder. 
Even where reliability has been reported in 
detail, incompleteness and uncertainty remain. 

Reviews by Macfarlane and Tuddenham 
(6) and by Ainsworth (1) point to a variety 
of difficulties in treating projective test data 
by statistical methods. One difficulty is the 
development of objective scores which reflect 
the integrity or Gestalt of the personality. 
Without such scores subjective judgment must 
be resorted to, which may confound test reli- 
ability with judge reliability. 

A more important problem, perhaps, is the 
relationship between test reliability and the 
stability of the personality. Since behavior is 
not usually consistent, marked differences in 
responses might be expected on repeated ad- 
ministrations of a test. The personality is as- 
sumed, however, to have a basic organization 
which should be reflected in the test responses. 
Reliability, therefore, may be expected when 
the clinical features are used as a basis for 
measurement. 

¹ This study was supported in part by a grant
from the United States Public Health Service, Insti-
tute of Mental Health. The authors gratefully ac-
knowledge the critical and helpful assistance of the
psychology staff of St. Christopher’s Hospital, Phila-
delphia, Penna. Sincere thanks also go to the Board
of Education, the Hunter, the Miller, and the Mc-
Kinley Schools of Philadelphia for helpful coopera-
tion.

² Presented at the Eastern Psychological Associa-
tion meeting at New York, April 12, 1957.


The various attempts to study reliability
of projective tests seem to direct attention to
these problems rather than to solve them.
Since validity implies reliability, the sugges- 
tion has been made to emphasize the validity 
of the test, thus, in effect, bypassing the prob- 
lem (6). Another suggestion is to determine 
the limits within which the test is reliable 
by delineating the aspects of the test data 
which are unreliable because of such factors 
as the testing environment and the mental 
set of the subject (1). 

The present authors feel that a direct at- 
tack on the problem of reliability is possible. 
It is suggested that aspects of reliability which 
involve clinical features of the test be sought. 
The demonstration that many of these ele- 
ments are reliable would make it reasonable 
to place confidence in the test’s stability, at 
least, within a clinical framework. 

The above approach is illustrated in a se- 
ries of reliability studies conducted as part of 
a more general research on the usefulness of 
the Blacky Pictures Test for children. It was 
found possible to make use of three traditional 
measures of or approaches to reliability. These 
were (a) judgment reliability or the extent to 
which the scorers agree, (b) temporal reli-
ability or the extent to which the test-retest
scores are consistent, and (c) internal reli- 
ability or the extent to which split halves are 
consistent. 


Subjects 


The Ss were grade school pupils, 28 girls 
and 12 boys, whose ages ranged from six 
years, six months to eleven years, three 
months. Their IQs on the Revised Stanford 
Binet, Form L, ranged from 81 to 127. All Ss 
had been evaluated as being well-adjusted by
means of teacher ratings and clinical judg-
ments prior to the administration of the
Blacky Pictures Test. All testings, including
the administration of the Blacky, were on an
individual basis. Twenty Ss (Group A) had
a second administration of the Blacky three
months to one year later. The other 20 Ss
(Group B) were selected from a larger group ³
and were individually matched to Group A
for age, sex, IQ, and socioeconomic level. The
mean age for Group A was nine years, five
months and for Group B, nine years, six
months. The mean IQ for Group A was 106.8
and for Group B, 107.5. The difference be-
tween the means is not significant.

³ This larger group consisted of 88 Ss, selected from
the Philadelphia public schools, to be used in a se-
ries of studies of the Blacky Pictures Test. Each
child was considered well-adjusted as rated by his
teacher and was further judged well-adjusted by a
clinical psychologist before the Blacky was adminis-
tered.


Judgment Reliability


Reliability of scores is basic to the use
of any test. With projectives, there has fre-
quently been the problem of devising a scor-
ing system which abstracts the data in a clini-
cally meaningful and reliable fashion. Since
totally objective scoring generally proves diffi-
cult or impossible to devise, reliability of scor-
ing, in effect, is dependent on the reliability
of judgments. In the case of the Blacky, Blum
(3) suggests a rating of “strong” or “weak”
as one type of score for the emotions elicited
by each of the pictures or personality dimen-
sions alleged to be tapped by the test. Ac-
cordingly, the following hypothesis suggests
itself: Judges agree on the scoring of the spon-
taneous stories of the Blacky Pictures Test.
Using the criteria for scoring “strong” or
“weak” proposed by Blum, with modifications
by Hilgeman (5), Winters (7), and the
writers, each of 10 judges rated the stories of
two cards for the 40 Ss of Groups A and B.
The validity or clinical significance of scoring
was not in question. Merely considered was
the extent to which the test stories could be
scored by this system in a consistent fashion.
Table 1 shows the percentage of agreement
of the judges for each card. That agreement
is better than chance is supported at less than
the .05 level of confidence for all but two of
the cards, the majority of p values being .01
or less. It should be noted, however, that only
three of the eleven groups of judgments show
agreement of 90 per cent or over. The results
suggest that consistency of scoring may be
achieved but that improved concreteness of
scoring criteria needs to be developed.


Table 1

Percentage Agreement of Judges in Rating Spontaneous
Story (Strong or Weak) of Each Card
(N = 40)

p Value (Binomial), in source order:
.01, .34, .001, .01, .001, .001, .001, .05, .21, .001, .001

[Card numbers and % Agreement values are not legible in the source.]

* Card VI was scored twice, A for “castration anxiety,” B
for “penis envy.”


Temporal Reliability


As suggested earlier, repeated administra-
tions of a projective test to the same S over
a period of time are not expected to produce
the same responses, but they should reflect
the stable components of the personality. De-
spite the undefined nature of the components,
reflections or indirect indicators may be used
to evaluate the reliability of a test. In the
present study, three hypotheses were tested
as an approach to the clarification of this
principle.


Table 2

Percentage of Correct Matching Between Two
Test Administrations to Same Ss

% Matching    p Value*
100           .001
100           .001
 43           .07
 14           .37
100           .001
 72           .001
 72           [illegible]
 57           .02

[One row per judge, in source order; judge labels are not legible in the source.]

* This value was determined by a direct application of proba-
bility theory as suggested by Feller (4, pp. 62-66).


Reliability of Projective Tests


Table 3

Judge Agreement in Rating Similarity of Thematic
Content for Two Test Administrations to
Same Ss and for Single Administra-
tions to Matched Ss (N = 40)

Card No.    % Agreement    p Value (Binomial)
I           70             .02
II          73             [illegible]
III         68             .05
IV          60             .21
V           78             .001
VI          75             .01
VII         65             .06
VIII        75             .01
IX          55             .52
X           88             .001
XI          83             .001

Hypothesis 1. Blacky Test records obtained
at different times with the same Ss can be
matched. The Blacky protocols of the 20 Ss
of Group A who had been tested twice over a 
period of three to twelve months were di- 
vided into three sets, one of six boys and two 
of seven girls. Two judges matched the set of 
six, and three judges matched each of the sets 
of seven. Matching was done on a gross basis, 
permitting the utilization of a variety of clues 
such as language, story content, and emo- 
tional patterns. The judges were informed, 
furthermore, of the sex of the Ss. 

All matchings but one were beyond chance 
expectation as shown in Table 2. Perfect 
matching was obtained in only three of the 
seven matchings. It appears, however, that 
there was sufficient temporal consistency in 
the Ss’ responses generally to enable five of 
the eight judges to recognize which records 
belonged together. 
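
The chance expectation in a matching task of this kind can be sketched with the classical matching (rencontres) problem treated by Feller. The following is a minimal illustration, assuming that under the null hypothesis a judge's pairing is a random one-to-one assignment of retest records to first records; the function names are hypothetical, and the exact computation behind the p values in Table 2 is not given in the text.

```python
from fractions import Fraction
from math import comb, factorial


def derangements(n: int) -> int:
    """Number of permutations of n items that leave no item in its own place."""
    if n == 0:
        return 1
    if n == 1:
        return 0
    prev2, prev1 = 1, 0  # D(0), D(1)
    for k in range(2, n + 1):
        # Recurrence: D(k) = (k - 1) * (D(k - 1) + D(k - 2))
        prev2, prev1 = prev1, (k - 1) * (prev1 + prev2)
    return prev1


def p_exact_matches(n: int, k: int) -> Fraction:
    """Chance probability of exactly k correct matches among n records."""
    # Choose which k records are matched correctly; the rest must all be wrong.
    return Fraction(comb(n, k) * derangements(n - k), factorial(n))


def p_at_least_matches(n: int, k: int) -> Fraction:
    """Chance probability of k or more correct matches among n records."""
    return sum(p_exact_matches(n, j) for j in range(k, n + 1))
```

Under this assumption, a perfect matching of seven records has a chance probability of 1/5040 (about .0002), while exactly one correct match out of seven occurs by chance about 37 per cent of the time.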

Hypothesis 2. Thematic content of spon- 
taneous responses to each of the Blacky Pic- 
tures is more frequently similar for the same 
Ss on repeated test administrations than for 
a group of Ss and their matched controls. The 
spontaneous stories from the two sets of 
Blacky protocols of Group A and those ob-
tained from Group B were used to test this
hypothesis. Two judges evaluated the the- 
matic content of the three spontaneous stories 
presented for each of the 11 Blacky cards. 
Judgments were then made as to whether the 
stories or responses were similar to or differ- 
ent from each other. For the guidance of the 
judges, it was suggested that they consider the 
manifest rather than the implied content in 
determining whether any two stories were 
similar to each other. 

Agreement between judges is presented in 
Table 3. There would seem to be only fair
consistency between the judges, yet the
overall pattern is one of greater than chance
agreement. To test the hypothesis, a paired
t was computed for the differences between
the number of stories judged similar (t equals
7.25). The hypothesis is supported at the .001 
level of confidence. 
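
A binomial p value for judge agreement can be sketched as follows. This is an illustration only: it assumes that chance agreement between two judges making a similar/different judgment is .5 and that the test is two-sided, neither of which is stated in the text, and the function name is hypothetical.

```python
from math import comb


def binomial_two_sided_p(agreements: int, n: int) -> float:
    """Two-sided exact binomial p value against a chance agreement rate of .5."""
    # With a chance rate of .5 the distribution is symmetric, so the two-sided
    # p value is twice the tail beyond the count farther from n / 2.
    k = max(agreements, n - agreements)
    upper_tail = sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Example: 28 agreements in 40 judgments (70 per cent agreement).
p_value = binomial_two_sided_p(28, 40)
```

Under these assumptions, 28 agreements in 40 judgments give a p of about .017.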

The finding points to the overall trend in 
the direction of significantly greater similar- 
ity of thematic productions between the two 
administrations of the test to Group A than 
between the single administrations to Groups 
A and B. Despite the much better than chance
result, it should be noted that the amount of
consistency in thematic productions by the
individual S is not very high, as shown in
Table 4.


Table 4

Percentage of Similarity of Thematic Content Be-
tween Two Test Administrations to Same
Ss and Between Single Adminis-
trations to Matched Ss

Ss Pair No.    Same Ss        Matched Ss
 1             55             32
 2             50             23
 3             77             36
 4             86             32
 5             55             41
 6             45             50
 7             36             18
 8             82             27
 9             59             36
10             77             41
11             77             27
12             73             59
13             [illegible]    50
14             [illegible]    41
15             [illegible]     5
16             [illegible]    27
17             [illegible]    32
18             [illegible]    41
19             [illegible]    41
20             [illegible]    18


Table 5

Consistency of Likes and Dislikes of Particular Cards
for Two Administrations to Same Ss
(N = 20)

Card No.    Same pref.    p Value (Binomial)
I           15            .001
II          15            .001
III         14            .001
IV          14            .001
V           17            .001
VI          19            .001
VII         15            .001
VIII        14            .001
IX          16            .001
X           14            .001
XI          17            .001

Hypothesis 3. Ss are consistent in their 
likes and dislikes for individual cards over a 
period of time. The preferences or “like” and 
“dislike” choices for the two sets of Blacky 
protocols of Group A were compared. Table 5 
shows the extent to which each S placed the 
individual cards in the same category for the 
two administrations of the test. Marked con- 
sistency is evident in the choices, supporting 
the hypothesis at the .001 level of confidence 
for each card. 


Internal Reliability 


The writers recognize that a split-half ap- 
proach to reliability tends to destroy the 
global quality of a projective test. Neverthe- 
less, it seems feasible to consider the data in 
terms of specific, clinically meaningful dimen- 
sions which can be evaluated through a sam- 
pling of the test responses of the individual 
Ss. Such an approach serves to maintain the 
overall unity or Gestalt of the test record and 
yet permits its division into two approximately 
similar halves. Along these lines the writers 
attempted to study the test’s reliability with 
regard to (a) verbal fluency as measured by 
the number of words in the spontaneous 
stories, and (b) organization as reflected
through the use of the central theme of each 
picture. 


Hypothesis 1. The verbal fluency of Ss is 
consistent within the Blacky Test. 

The number of words for each spontaneous 
story of the 40 Ss (the first test administra- 
tion of Group A and the records of Group B) 
was counted. The total of each S on Cards 2, 
4, 6, 8, and 10 was compared with his total 
for Cards 3, 5, 7, 9, and 11, Card 1 being 
omitted because of the uncertain effect of its 
unique position in the series. The rank-order 
correlation between the two halves was .92, 
which is significant at the .01 level of con- 
fidence. The similarity of fluency between the 
two halves of the test is also reflected in the 
means and standard deviations (mean of 2-10 
equals 184 words and SD equals 107; mean 
of 3-11 equals 178 words and SD equals 129; 
t equals 0.64). 
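
The split-half comparison above can be sketched as a rank-order (Spearman) correlation between each child's word total on the even-numbered cards and on the odd-numbered cards. The word totals below are hypothetical and serve only to show the computation; they are not the study's data, and the helper names are illustrative.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Rank-order correlation: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical word totals for six children (not the study's data).
even_cards = [120, 310, 95, 180, 240, 60]   # Cards 2, 4, 6, 8, 10
odd_cards = [160, 330, 100, 150, 260, 70]   # Cards 3, 5, 7, 9, 11
rho = spearman_rho(even_cards, odd_cards)
```

Computing the coefficient as a Pearson correlation of average ranks handles tied word totals, which the shortcut formula based on squared rank differences does not.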

Hypothesis 2. In organizing their spontane- 
ous stories, the central theme of each card 
will be used in a consistent fashion by the Ss. 

The same division of the test as described 
above was used. Two judges scored each story 
of the 40 records as to whether it was “struc- 
tured” (utilized the central theme of the pic- 
ture) or “unstructured” (failed to use the 
central theme of the picture). Agreement be- 
tween the judges was 100 per cent. The phi 
coefficient of correlation between the two 
halves of the test was found to be .67, which 
is significant at the .01 level of confidence.
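
A phi coefficient for two halves of a test can be sketched from a 2 x 2 table. The text does not say how the story-level judgments were combined for each S, so the sketch assumes that each child is classified as predominantly structured or unstructured on each half; the counts are hypothetical.

```python
from math import sqrt


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi for the 2 x 2 table [[a, b], [c, d]].

    a: structured on both halves      b: structured on first half only
    c: structured on second half only d: unstructured on both halves
    """
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom


# Hypothetical classification of 40 children (not the study's data).
phi = phi_coefficient(18, 4, 3, 15)
```

With these illustrative counts phi is about .65; the coefficient is 1 when every child falls in the same category on both halves and 0 when the halves are unrelated.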


Discussion and Conclusions 


The results reported do not represent high 
level reliabilities, but since they are uniformly 
in the same direction they command some re- 
spect. Accordingly, the Blacky Pictures Test 
would seem to have a significant degree of re- 
liability when used with a group of children.* 
Of perhaps greater interest, however, are the 
implications derived from the approaches used 
for the study of reliability. The concept of a
continuing study of a test’s reliability is sug-
gested. What seems required is that many as-
pects of the test should be investigated over
a period of time and under varying circum-
stances. In this fashion, there is a gradual ac-
cumulation of data clarifying the nature of
the test’s stability. Since an overall “coeffi-
cient of reliability” for a projective test does
not seem to be feasible, this approach might
represent a reasonable alternative.

* Since there is a five-year age span in the group,
and test responses are expected to differ with age,
the question arises as to the effect of the age factor
on the reliability results obtained. The small sample
and design of the study, unfortunately, did not per-
mit an adequate control for age. Inspection of the
data, however, reveals no marked differences in con-
sistency of response patterns between the older and
the younger Ss. This, of course, is not a definitive
answer to the problem. Further study of the age
factor is in order.

From a clinical standpoint, there are ad- 
vantages to such an approach. Data are pro- 
vided which throw light on the significance of 
test results obtained under varied circum- 
stances. Thus, the objectivity of interpreta- 
tions and judgments is improved. Since hy- 
potheses, such as those tested in this study, 
tend to be related to qualities of the test gen- 
erally given attention in the clinical situation, 
the clinician can more easily differentiate the 
stable aspects of the test performances from 
those which are a function of the immediate 
situation. 

In all, the present study attempts to sug- 
gest that the traditional concepts of reliability 
need not be abandoned in dealing with pro- 
jective materials. Some broadening of the in- 
terpretation of the concepts is called for. Also 
necessary is the adaptation of the test data to 
available statistical or analytical techniques, 
but with care being exercised to preserve the 
clinical entity of the responses being studied. 


Summary 


In this study, the feasibility of developing 
reliability measures of projective tests based 
on the clinical aspects of the test material is 
considered. Using data obtained with the
Blacky Pictures Test on 40 school-age chil-
dren, several hypotheses are explored related
to judgment, temporal, and split-half reli- 
abilities. 

The data are analyzed with a view to main- 
taining the global or clinical quality of the 
test responses. Evidence is derived which sup- 
ports the test’s stability to a modest degree. 

Attention is given, in addition, to the diffi- 
culties of deriving an over-all “coefficient of 
reliability” for projective measures. This study 
indicates that integration of varied approaches 
to a test’s consistency may serve as an ap- 
propriate alternative. 


Received May 6, 1957.


Personality Correlates of Q-L Differentials
on the ACE¹, ²


B. Spilka
University of Denver

and

Gloria Kimble
Washburn University


In 1952, Altus (1) empirically attempted 
to determine correlates of intellectual varia- 
tion in a population of women students. Be- 
cause of the empirical nature of this work, 
the authors felt it desirable to verify and ex- 
plain Altus’ findings further. 

Altus item-analyzed certain MMPI items 
against ACE Q-L differentials. Of the 566
items available, 43 met his criterion of selec- 
tion, but only 26 attained statistical signifi- 
cance at the .05 level. 

Nine categories of 3 to 8 items were then 
constructed, and descriptions of the subjects 
were based on tendencies to answer these 
items in a certain direction. The validity of 
such item groups and this type of interpretive 
procedure is questionable. 

Recognizing some of these difficulties, Altus
repeated his study and found that a valida-
tion correlation between the Q-L scores
and the items dropped from .63 to .25. How-
ever, the latter coefficient was significant and
thus assumed to indicate validity for the
items.

In the present study, 87 women students
at Washburn University served as Ss. Each 
was administered the ACE, the complete 
MMPI, the Thurstone Scale of Religious 
Orthodoxy, the Taylor Manifest Anxiety 
scale, the F scale, the Siegel Manifest Hos- 
tility scale, and measures of Behavioral, Feel- 
ing-Tone, and Somatic manifest anxiety de- 
veloped by Spilka and Siegel. 

The MMPI Mf scale was used to assess
Altus’ categories of sexual inversion and im-
maturity, and masculine-feminine vocational
preferences. The Orthodoxy scale, the F scale,
and the MMPI Pd scale appeared to corre-
spond closely to Altus’ “religiose” and “rose-
colored glasses” items. The MMPI Pt scale
was considered a measure of the Ss’ obsessive-
compulsive tendencies.

¹ The authors would like to express their thanks
to E. D. Turner for his statistical help.

² An extended report of this study may be ob-
tained without charge from Bernard Spilka, Depart-
ment of Psychology, University of Denver, Denver
10, Colorado, or for a fee from the American Docu-
mentation Institute. Order Document No. 5468, re-
mitting $1.75 for microfilm or $2.50 for photocopies.

The MMPI Hs, D, and Hy scales, plus the
4 scales of manifest anxiety, were used to 
assess Altus’ “anxiety” syndrome. The “so- 
cial sensitivity” category was measured by 
the MMPI Si scale and the Behavioral Anx- 
iety scale. Altus’ “resentful attitudes” cate- 
gory was assessed by the MMPI Pd scale 
and the Manifest Hostility scale. 

Only 6 of Altus’ items now satisfied his ac- 
ceptance criterion and only 2 were signifi- 
cant at the .05 level. Neither of these items 
satisfied the acceptance criterion, and only 
one of them was significant in Altus’ study. 
None of the correlations computed to assess 
the validity of Altus’ categories attained sig- 
nificance. 

It was found that 30 of the 43 items ex- 
hibited scoring tendencies in agreement with 
those obtained by Altus. The correlation be- 
tween the 43 items and the differentials was 
.22, revealing significance at the .05 level, 
and supporting Altus’ “modicum of validity” 
claim. Where it lies and what comprises it re-
main unknown.

It is recommended that further work in 
this area be conducted on a theoretical rather 
than an empirical basis. 

Brief Report. 
Received November 19, 1957.


Learning in Aphasic Patients¹, ²


Leo Katz
VA Outpatient Clinic, Brooklyn, N. Y.


Aphasic disturbances are generally defined
as impairments of language functions due to
brain damage, either through cerebro-vascular
accident or external injury to the brain. The
extent to which other mental functions are
impaired in aphasia is a subject of consider-
able controversy.

This study attempted to investigate the
presence of certain learning deficiencies in
aphasic patients often described by aphasia
therapists (6, 8, 10). These workers report
that frequently a patient may be unable to
repeat a phrase or word he has just heard
when he is asked for it, but that sometimes,
when he is not asked for it, or when he is per-
mitted or encouraged to use roundabout meth-
ods, he may be able to furnish the material
previously learned. Similar observations were
reported by Conrad (1), who studied 24 cases
of predominantly expressive aphasics who had
word-finding difficulties. He explained their
impairment as an inability to structure the
conceptual field, so that peripheral fragments
from the background of the Gestalt are of-
fered by the patient as his best approximation
to the “final” and inaccessible figure, the word
itself. To facilitate recall in such patients, he
advocated that they be encouraged to supply
whatever peripheral or tangential elements
are available to them in their efforts to ar-
rive at the desired solution. To some extent,
this roundabout method of recapturing lost or
forgotten material is also used by non-brain-
injured people when trying to remember
names or dates (3).

¹ This paper is based on a dissertation submitted
in candidacy for the degree of doctor of philosophy
at Teachers College, Columbia University. The writer
is indebted to members of his thesis committee, E. J.
Shoben, Jr., P. E. Eiserer, Virginia M. Axline, and
H. Solomon for their suggestions and encouragement.

² Presented to the twenty-sixth annual meeting of
the Eastern Psychological Association, Philadelphia,
Pa., 1955.

Thus, under certain conditions, making an 
effort or concentrating on a task does not seem 
to result in more efficient performance than is 
obtained in the absence of such self-direction. 
The present investigation was undertaken to 
determine whether aphasics exhibit the usual 
gain in efficiency of recall which results from 
the intention to remember as compared with 
the absence of such an intention. 

Traditionally, learning has been found to 
be better with goal-directed than with inci- 
dental instructions. It seems obvious that a 
person making an effort to learn will perform 
more adequately than one not making such 
an effort. The hypothesis of the present study, 
accordingly, is that goal-directed learning, as 
compared with incidental learning, is signifi-
cantly more efficient for nonaphasics than for
aphasic patients. 


Method 


The material to be learned consisted of two 
series of 20 postage stamps (Series X and Y), 
which were exposed to the subjects (Ss), one 
stamp at a time for five seconds each. In the 
incidental learning situation, one series of 
stamps was exposed with instructions to 
evaluate the stamps aesthetically (“tell me 
whether you like it or not’’). After all of the 
20 stamps had been shown, they were mixed 
with a group of 40 other stamps which the S 
had not been shown and he was asked to pick 
out all the stamps he could remember. The 
learning score was the number of stamps cor- 
rectly selected minus one-half the number of 
stamps incorrectly selected. 
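
The scoring rule just described can be written out as a small function. This is a sketch only; the function name and the range checks (20 exposed stamps mixed with 40 new ones) are illustrative assumptions drawn from the description above.

```python
def learning_score(correct: int, incorrect: int) -> float:
    """Stamps correctly selected minus one-half the stamps incorrectly selected."""
    # 20 stamps were exposed and then mixed with 40 stamps not shown before.
    if not 0 <= correct <= 20:
        raise ValueError("correct selections must lie between 0 and 20")
    if not 0 <= incorrect <= 40:
        raise ValueError("incorrect selections must lie between 0 and 40")
    return correct - incorrect / 2


# Example: 14 stamps correctly selected and 4 incorrectly selected.
example_score = learning_score(14, 4)
```

The half-point penalty discourages indiscriminate selection: an S who picked all 60 stamps would score 20 - 40/2 = 0.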

In the goal-directed learning situation, 
which always followed the incidental learn-
ing situation by about 24 hours, the Ss were
shown the other series of postage stamps, one 
at a time for five seconds each, this time with 
instructions to remember them so as to be 
able to select them afterwards from another 
group of 40 stamps. The learning score again 
was calculated in the manner described previ- 
ously. 

The difficulty of the task depended in part 
upon the degree of similarity between the 20 
stamps to be remembered and the 40 stamps 
from which the 20 had to be selected. At- 
tempts were made to equate this degree of 
similarity in both series. In order to deter- 
mine the equivalence of the two series, they 
were administered to a normative group of 18 
Ss with goal-directed instructions only. The 
means and standard deviations of the scores 
obtained under these conditions were essen- 
tially equivalent. The split-half reliabilities of 
the scores of the normative group were r = .85 
for Series X and r= .89 for Series Y, after 
the Spearman-Brown correction. 
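
The Spearman-Brown correction mentioned above steps the correlation between two half-tests up to an estimate for the full-length test. A minimal sketch follows; the function names are hypothetical, and the uncorrected half-test correlations are not reported in the text, so the second function only shows what they would have to be under the standard formula.

```python
def spearman_brown(r_half: float) -> float:
    """Estimated full-test reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)


def half_test_r(r_full: float) -> float:
    """Inverse of the correction: half-test correlation implied by r_full."""
    return r_full / (2 - r_full)


# Half-test correlations implied by the corrected values reported above.
implied_x = half_test_r(0.85)   # Series X
implied_y = half_test_r(0.89)   # Series Y
```

Under the standard formula, the corrected values of .85 and .89 would correspond to uncorrected half-test correlations of about .74 and .80.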

Stamp Series X and Y were alternately ad-
ministered to successive Ss; e.g., S 1 might
have Series X administered with incidental in-
structions, and S 2 would then have
Series Y administered with incidental instruc-
tions. Thus, any differences obtained with goal-
directed and incidental instructions were prob-
ably not due to differences in difficulty be-
tween the two stamp series.


Subjects 


The Ss for this study were selected from 
two Veterans Administration hospitals and 
one clinic which cooperated in this research.³
Stamp collectors were excluded from the 
study. 

The experimental group consisted of 25 na- 
tive white male patients between the ages of 
19 and 51 who were being treated for aphasia 
and right hemiplegia due to either cerebro- 
vascular accident or traumatic injury to the 
brain. All of the aphasics were being given 
language retraining and had been classified
by the language therapists as predominantly
expressive. No acutely aphasic patients were
included in the group.

³ The writer is indebted to the staffs of the neu-
rology and psychology services of the Boston VA
Hospital, the Bronx VA Hospital, and the Mental
Hygiene Clinic of the VA New York Regional Of-
fice for their cooperation and assistance in making
patients available for this study.

The control group consisted of 25 native,
white male patients between the ages of 19 
and 48 from the surgical wards of a Veterans 
Administration hospital. Patients with his- 
tories of psychiatric conditions, gastric or 
duodenal ulcer, or essential hypertension were 
excluded. The patients were tested postopera- 
tively and each name was first submitted to 
the physician in charge in order to eliminate 
those with postoperative complications. 

A comparison of the mean age and educa- 
tion of the Ss indicated that the two groups 
were essentially homogeneous with respect to 
age and education. The two groups were found 
also not to differ significantly in sociocultural 
status, as measured by occupational level with 
the Goodenough Anderson Occupational Rat- 
ing Scale (5). 


Results 


The hypothesis states that goal-directed
learning as compared with incidental learn-
ing is more efficient for nonaphasics than for
aphasics. To test this hypothesis, the mean
recall scores for each type of learning were 
compared within each of the two groups. 

For the nonaphasics, the mean recall score 
for incidental learning was 10.66 with a stand- 
ard deviation of 3.88. The mean recall score 
for goal-directed learning was 13.12 with a 
standard deviation of 2.55. The correlation 
between recall scores obtained with incidental 
and goal-directed instructions was .6135. A 
t test applied to the difference between the 
mean scores yielded a t of 3.93, which was
found to be significant beyond the .001 level.
These results indicated that nonaphasics per-
formed significantly better with goal-directed
than with incidental instructions.

For the aphasics, the mean recall score for 
incidental learning was 9.92 with a standard 
deviation of 3.17. The mean recall score for 
goal-directed learning was 9.06 with a stand- 
ard deviation of 3.24. By inspection of the 
mean scores it is apparent that the aphasics 
did not earn a higher recall score with goal- 
directed instructions. Their mean recall score 
with goal-directed instructions is lower than
the mean score earned with incidental instruc-
tions. These results support the hypothesis.

In order to determine whether the differ- 
ence between recall scores obtained by the 
aphasics was significant, a t test was applied
to the difference between the mean scores.
The correlation between incidental and goal-
directed scores was .7167. The t of 1.76, which
resulted from this analysis, indicated that the
difference was not significant. 
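
The t values reported for the two groups can be approximately reconstructed from the summary statistics given above. The sketch below assumes a t test for correlated means in which the standard deviations were computed with N in the denominator, so that the standard error of the mean difference uses N - 1; this is an inference from the reported figures, not a formula stated in the text, and the function name is hypothetical.

```python
from math import sqrt


def paired_t(mean_a, sd_a, mean_b, sd_b, r, n):
    """t for the difference between two correlated means, from summary statistics."""
    # Variance of the difference scores follows from the two SDs and their correlation.
    var_diff = sd_a ** 2 + sd_b ** 2 - 2 * r * sd_a * sd_b
    # Assumption: SDs were computed with N, hence N - 1 in the standard error.
    se_diff = sqrt(var_diff / (n - 1))
    return (mean_b - mean_a) / se_diff


# Incidental (a) versus goal-directed (b) recall, 25 Ss in each group.
t_nonaphasic = paired_t(10.66, 3.88, 13.12, 2.55, 0.6135, 25)
t_aphasic = paired_t(9.92, 3.17, 9.06, 3.24, 0.7167, 25)
```

Under this assumption the nonaphasic figures give a t of about 3.93, and the aphasic figures give a t of about 1.75 in absolute value (negative here because the goal-directed mean is the lower one), close to the reported 1.76.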


Discussion


The results of the experiment show that 
goal-directed instructions, as compared with 
incidental instructions, produced higher re- 
call scores in the nonaphasics but not in the 
aphasics. What is the explanation for the lack 
of improvement in the scores of the aphasics 
despite their knowledge of the task expected 
of them? 

One possible explanation for this failure to 
profit from the goal-directed set may be the 
aphasic’s relative inability to retain or main- 
tain the set induced by the goal-directed in- 
structions with sufficient consistency to per- 
form more adequately. Such a notion is con- 
sistent with studies of brain functions in both 
animals and humans which report increased 
distractibility in the performance of brain- 
injured Ss (4, 7, 9). The psychological proc- 
esses responsible for this increased distracti- 
bility in the human Ss, while difficult to 
specify, may be related to deficiencies in the 
use of implicit verbalizations or cue-produc- 
ing responses. Implicit verbal cues tend to be 
employed for “rehearsal” of the instructions 
when the task becomes complicated. In such 
circumstances, one may quickly “run through 
the steps” of a demanded activity to make 
sure one understands the task. The goal-di- 
rected learning task in the present study was 
purposely made simple in order to minimize 
interferences which might result from this 
possible inability to maintain the set induced 
by instructions. 

Another, and more far-reaching explanation 
for the aphasic’s relative inability to profit 
from knowledge of the goal may be related 
to the aphasic’s drive state when confronted 
with the need to solve a problem. Aphasics 
have a high degree of experience with failure. 
Not only are they handicapped with respect
to the use of language, they also suffer from
severe restrictions in the use of their limbs 
(many aphasics are paralyzed on one side of 
the body). Thus, they experience repeated 
failure in ordinary motor situations, in the 
process of communication and in social rela- 
tionships which characteristically depend upon 
speech proficiency. It was thought, therefore, 
that as soon as aphasics are made aware of 
a new expectation of achievement, their his- 
tory of experience with failure might reacti- 
vate anxieties which inhibit their deriving the 
same benefit from a goal-directed learning 
situation as nonaphasics can. 

This notion of the inhibitory effects of anx- 
iety is consistent with Goldstein’s (4) theory 
of the organic patient’s inability to cope with 
a problem as the determinant of his cata- 
strophic reaction. Undoubtedly, the anticipa- 
tion of this inability to cope, or the memory 
of previous failures, can serve to evoke or to 
perpetuate the feelings of anxiety which may 
have interfered with a more efficient perform- 
ance of the aphasics in the goal-directed 
learning situation. 

Either one of the two explanations offered 
can account for the data, but it is also pos- 
sible that both of them are applicable. It is 
possible, for example, that anxiety due to the 
reactivation of previous experiences with fail- 
ure may have made it more difficult for the 
aphasics to retain the set induced by the in- 
structions. 

In the present experiment, it was not pos- 
sible to observe directly the drive state of the 
Ss or even to subject it, indirectly, to some 
measurement procedure. It might be of inter- 
est, however, to note that one of the aphasics 
repeatedly requested that he be permitted to 
terminate the goal-directed learning situation, 
although he had made no such request on the 
previous day in the incidental learning situation.


Summary 


The study investigated the relative effec- 
tiveness of goal-directed and incidental learn- 
ing in aphasic patients. Observations about 
the performance of aphasics made by differ- 
ent workers were the source of the hypothe- 
sis which stated that in aphasic patients the 
normally more effective goal-directed learning 
situation is not significantly more effective
than the incidental learning situation. In order
to test this hypothesis, two equivalent series
of postage stamps were exposed to 25
predominantly expressive male aphasic patients
between the ages of 19 and 51, once
with instructions to remember the stamps and 
once with instructions to evaluate the stamps 
aesthetically. Retention of the material was 
tested and the scores obtained by the aphasics 
were compared with those obtained by a con- 
trol group of 25 male nonaphasics. Results 
confirmed the hypothesis. 


Received May 7, 1957. 
Wechsler-Bellevue Scatter as an Index of Schizophrenia


Arnold Trehub and Isidor W. Scherer 
VA Hospital, Northampton, Mass. 


Numerous studies have been reported which 
have used some measure of scatter on the 
Wechsler-Bellevue as a basis for clinical diag- 
nosis. The recent review by Guertin, Frank, 
and Rabin (2) summarizes the yield to date 
as “inconclusive.” The criticism has been of- 
fered that instances where good agreement 
has been found between psychiatric and W-B 
diagnoses may be “due to the inspectional 
techniques of the clinician rather than the 
numerical scatter values” (4, p. 229). Fur- 
thermore, it is held that while W-B scatter 
may successfully differentiate groups in terms 
of a statistically significant difference, it has 
not been demonstrated to be effective in the 
diagnosis of individuals. 

The present study dealt with the individual 
discrimination of schizophrenics from neu- 
rotics and character disorders on the basis of 
a scatter index computed directly from W-B 
subtest scores. 


Procedure 


The sample consisted of a random selection 
of male patients in a Veterans Administration 
neuropsychiatric hospital who held a psychiatric
diagnosis of schizophrenia, neurosis, or
character disorder, who had been referred for
psychological testing, and who had been administered
a complete (11 subtests) W-B
Form I. The scatter index used involved sheer 
variability of subtest performance without ref- 
erence to any particular patterns of achieve- 
ment. Weighted scores of all eleven subtests 
on the W-B were averaged for each case, and 
the differences of all subtests from the mean 
were summed disregarding sign. The sum ob- 
tained was the individual’s scatter score. An 
earlier pilot investigation showed that when 
all patients in the sample who had a scatter
score of 19 or greater were labeled schizophrenic
and all below 19 were labeled neurotic
or character disorder, we were correct in
approximately 70% of the cases against the 
criterion of psychiatric diagnosis. Accordingly, 
in the present study a sequential analysis
plan was set up (1) in which the hypothesis
of 60% correct diagnoses (H0: P0 = .60) was
tested against the alternate hypothesis of 70%
correct diagnoses (H1: P1 = .70). Risks α and
β were both set at .05. As each case was
drawn, the diagnostic decision was based upon 
the following rules: 


Scatter index ≥ 19 = schizophrenia.
Scatter index < 19 = neurosis or character
disorder.


Each diagnosis made on the basis of the scat- 
ter index was matched against psychiatric 
diagnosis and plotted as correct or incorrect 
in the sequential analysis channel. 
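
The scoring and decision rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' worksheet: the function names and the two example profiles are invented for illustration. Only the computation (the sum of absolute deviations of the 11 weighted subtest scores from their own mean) and the cutting score of 19 come from the text.

```python
from statistics import mean

CUTTING_SCORE = 19  # cutting score reported in the text


def scatter_index(weighted_scores):
    """Sum of the deviations of each subtest score from the mean, sign disregarded."""
    if len(weighted_scores) != 11:
        raise ValueError("expected 11 W-B Form I weighted subtest scores")
    m = mean(weighted_scores)
    return sum(abs(score - m) for score in weighted_scores)


def classify(weighted_scores, cutting_score=CUTTING_SCORE):
    """Decision rule from the text: index at or above the cutting score -> schizophrenia."""
    if scatter_index(weighted_scores) >= cutting_score:
        return "schizophrenia"
    return "neurosis or character disorder"


# Hypothetical profiles, invented for illustration only (not patient data).
flat_profile = [10, 11, 9, 10, 10, 11, 9, 10, 10, 11, 9]
uneven_profile = [14, 6, 12, 5, 13, 7, 15, 4, 11, 8, 15]

for name, profile in (("flat", flat_profile), ("uneven", uneven_profile)):
    print(name, scatter_index(profile), classify(profile))
```

Because the index ignores which subtests are high or low, two quite different patterns of achievement can receive the same scatter score, which is the sense in which the text calls it sheer variability.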


Results and Discussion 


When the 269th case was plotted, the sequential
analysis plan yielded the decision to
reject the hypothesis of 60% correct diagnoses
in favor of the hypothesis of 70% correct
diagnoses (α = .05, β = .05).
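
The report does not spell out the sequential plan (1). Assuming it has the usual form of a sequential probability ratio test for a proportion, the rule for stopping might be sketched as follows. The function name, the log-likelihood-ratio bookkeeping, and the example sequences are assumptions made for illustration; only the two hypothesized proportions (.60 and .70) and the two risks of .05 come from the text, and the sketch is not claimed to reproduce the authors' plotted channel.

```python
import math


def sprt(outcomes, p0=0.60, p1=0.70, alpha=0.05, beta=0.05):
    """Return (decision, cases_used) for a sequence of True/False diagnostic outcomes."""
    upper = math.log((1 - beta) / alpha)   # reaching this favors the 70% hypothesis
    lower = math.log(beta / (1 - alpha))   # reaching this favors the 60% hypothesis
    step_correct = math.log(p1 / p0)
    step_wrong = math.log((1 - p1) / (1 - p0))
    llr = 0.0
    n = 0
    for n, correct in enumerate(outcomes, start=1):
        llr += step_correct if correct else step_wrong
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "continue sampling", n


# Hypothetical outcome sequences, for illustration only.
print(sprt([True] * 30))
print(sprt([False] * 30))
print(sprt([True, False] * 5))
```

Under these assumptions each correct diagnosis moves the running total up by a small step and each incorrect one moves it down by a larger step, so a mixed record needs many cases before either boundary is reached.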

Computation of mean age at the time of 
testing shows the mean age in years to be 
31.8 for the schizophrenics, and 32.9 for the 
neurotics and character disorders. Mean IQ 
is 103.4 for the schizophrenics and 111.0 for 
the neurotics and character disorders. 

Since the patients used in this study consti- 
tuted a random sample of schizophrenics, 
neurotics, and character disorders who were 
referred for psychological testing, the diag- 
nostic composition of the total sample may 
be considered to approximate the actual base 
Table 1
Scatter Index Cutting Scores and Associated Diagnostic Consequences

Neurotics and Character Disorders

Scatter Index                  Cumulative
Cutting Score    Frequency     Frequency
    ≥ 28              1             1
      27              1             2
      26              3             5
      25              3             8
      24              3            11
      23              5            16
      22              3            19
      21              3            22
      20              7            29
      19              9            38
      18              4            42
      17              6            48
      16              4            52
      15             12            64
      14             11            75
      13             15            90
      12              7            97
      11              3           100
    ≤ 10              3           103

Note.—Base rate for schizophrenics (P) = 61.7%. Base rate for neurotics and character disorders (Q) = 38.3%.


rates of such patients at our hospital. We 
found that schizophrenics composed 61.7% 
of our sample, while neurotics and character 
disorders composed 38.3%. This breakdown, 
of course, is based upon a hospital subpopula- 
tion from which other diagnostic classifica- 
tions such as brain damage, psychotic depres- 
sion, etc. are excluded. 

It has been shown that with the method of 
computing scatter used in this study, and 
with the adoption of a scatter index of 19 as 
a cutting score, schizophrenics could be dif- 
ferentiated from neurotics and character dis- 
orders in more than 60% of the cases. In 
terms of the sequential analysis model, op- 
erating at a level of 70% correct diagnoses is 
significantly more likely than operating at a 
level of 60% correct diagnoses. But, having 
established this, we are still concerned with a 
number of other questions about the diag- 
nostic decisions that might be made on the 
basis of the W-B scatter index. Table 1 pre- 
sents an analysis of diagnostic consequences 
in the sample of patients for a number of 
different scatter index cutting scores. It is 
based upon procedures outlined by Meehl 
and Rosen (3). Following are definitions of 
terms used in the table: 


p1 = Proportion of schizophrenics correctly identified
by the test (“valid positive” rate).

p2 = Proportion of neurotics and character disorders
misidentified by the test as schizophrenic
(“false positive” rate).

Pp1 = Percentage of total sample correctly identified
as schizophrenic.

Qp2 = Percentage of total sample misidentified as
schizophrenic.

HT = Percentage of total correct diagnoses.

HP = Percentage of correct diagnoses if a diagnosis
is made only for patients scoring at or above
cutting score.

RP = Percentage of patients in total sample who
receive a diagnosis if a diagnosis is made only
for patients scoring at or above cutting score.
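
The definitions above can be turned into a short calculation. The sketch below is illustrative only: the function and argument names are invented, and the counts in the example (166 schizophrenics and 103 neurotics and character disorders, of whom 115 and 38 respectively score 19 or above) are read from the cumulative frequencies of Table 1. Small differences from the tabled values in the last decimal place can arise because the table appears to have been computed from rounded proportions.

```python
def diagnostic_indices(n_schiz, n_other, schiz_at_or_above, other_at_or_above):
    """Indices defined in the text, for one cutting score, from raw counts."""
    total = n_schiz + n_other
    P = n_schiz / total                     # base rate of schizophrenics
    Q = n_other / total                     # base rate of the other group
    p1 = schiz_at_or_above / n_schiz        # valid positive rate
    p2 = other_at_or_above / n_other        # false positive rate
    valid_pos = P * p1
    false_pos = Q * p2
    flagged = valid_pos + false_pos
    return {
        "p1": p1,
        "p2": p2,
        "Pp1": 100 * valid_pos,
        "Qp2": 100 * false_pos,
        "HT": 100 * (valid_pos + Q * (1 - p2)),
        "HP": 100 * valid_pos / flagged if flagged else float("nan"),
        "RP": 100 * flagged,
    }


# Cutting score of 19, using counts read from Table 1.
row_19 = diagnostic_indices(166, 103, 115, 38)
for key, value in row_19.items():
    print(f"{key}: {value:.4f}" if key in ("p1", "p2") else f"{key}: {value:.1f}")
```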


If nothing but base rates were available to 
us for making a diagnostic decision, we would 
call all patients schizophrenic and we would 
be correct in 61.7% of the cases. In this re- 
gard, one might ponder how much an informal 
appreciation of patient base rates might be 
contributing to our diagnostic acumen. 

It can be seen from Table 1 that when our 
original cutting score of 19 is used, we are 
correct in 66.9% of the cases (HT). We have
thus shown an improvement of 5.2% over 
base rate diagnosis. However, when we use a 
cutting score of 16, our percentage of total 
correct diagnoses rises to 72.1%, an improve- 
ment of 10.4% over base rate diagnosis. If 
Table 1 (continued)

Scatter       Schizophrenics
Index                  Cumulative
Cutting      Fre-      Fre-
Score        quency    quency       p1       p2      Pp1    Qp2     HT     HP      RP

 ≥ 28          20        20       .1205    .0097     7.4     .4    45.3   94.9     7.8
   27           6        26       .1566    .0194     9.6     .7    47.2   93.2    10.3
   26          10        36       .2169    .0485    13.4    1.9    49.8   87.6    15.3
   25           8        44       .2651    .0777    16.4    3.0    51.7   84.5    19.4
   24          14        58       .3494    .1068    21.6    4.1    55.8   84.0    25.7
   23          11        69       .4157    .1553    25.6    5.9    58.0   81.3    31.5
   22          10        79       .4759    .1845    29.4    7.1    60.6   80.5    36.5
   21          11        90       .5422    .2136    33.4    8.2    63.5   80.3    41.6
   20          16       106       .6386    .2816    39.4   10.8    66.9   78.5    50.2
   19           9       115       .6928    .3689    42.7   14.1    66.9   75.2    56.8
   18           8       123       .7410    .4078    45.7   15.6    68.4   74.6    61.3
   17          13       136       .8193    .4660    50.6   17.8    71.1   74.0    68.4
   16           7       143       .8614    .5048    53.1   19.3    72.1   73.3    72.4
   15           4       147       .8855    .6214    54.6   23.8    69.1   69.6    78.4
   14           3       150       .9036    .7282    55.8   27.9    66.2   66.7    83.7
   13           4       154       .9277    .8738    57.2   33.5    62.0   63.1    90.7
   12           7       161       .9699    .9417    59.8   36.1    62.0   62.4    95.9
   11           2       163       .9819    .9709    60.6   37.2    61.7   62.0    97.8
 ≤ 10           3       166      1.0000   1.0000    61.7   38.3    61.7   61.7   100.0




we choose to make a diagnostic statement only 
for those patients identified as schizophrenic 
by our index, it can be seen that with a cutting
score of 19 we make 75.2% correct diagnoses
(HP). In this case, however, we are
able to offer a diagnosis for only 56.8% of
the patients in our total sample (RP). Within
the limits of our procedure, as we increase our 
accuracy in making positive diagnoses, we pay 
the penalty of reducing the percentage of pa- 
tients about whom we can make diagnostic 
statements. Numerous other contingencies 
may be observed from Table 1. The reliability
and generality of our “valid positive”
and “false positive” rates can only be determined
by cross-validation in a variety of
clinical settings. Their associated diagnostic 
contingencies will vary as actual clinical base 
rates P and Q change. 
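
The dependence on base rates can be illustrated with a small calculation. This is a sketch under stated assumptions: the function name is invented, the valid positive and false positive rates for a cutting score of 19 (.6928 and .3689) are taken from Table 1 and held fixed, and the base rates of .50 and .30 are hypothetical values chosen only to show the direction of the effect.

```python
def hit_rates(base_rate, p1, p2):
    """Return (HT, HP) in percent for a schizophrenic base rate P = base_rate."""
    q = 1.0 - base_rate
    valid_pos = base_rate * p1
    false_pos = q * p2
    valid_neg = q * (1.0 - p2)
    flagged = valid_pos + false_pos
    ht = 100.0 * (valid_pos + valid_neg)
    hp = 100.0 * valid_pos / flagged if flagged else float("nan")
    return ht, hp


P1_AT_19, P2_AT_19 = 0.6928, 0.3689  # rates at cutting score 19, from Table 1

# .617 is the base rate in the present sample; .50 and .30 are hypothetical.
for base_rate in (0.617, 0.50, 0.30):
    ht, hp = hit_rates(base_rate, P1_AT_19, P2_AT_19)
    base_only = 100.0 * max(base_rate, 1.0 - base_rate)
    print(f"P = {base_rate:.3f}  HT = {ht:.1f}  HP = {hp:.1f}  "
          f"base rates alone = {base_only:.1f}")
```

With the same test characteristics, a setting in which schizophrenics were the minority would give a lower proportion of correct positive diagnoses, and the index could do worse than simply diagnosing by the more frequent category.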


Summary 


A Wechsler-Bellevue scatter index based 
upon the sum of subtest deviations from the 
mean subtest score was computed for a sam- 
ple of schizophrenics, neurotics, and charac- 
ter disorders. A cutting score of 19 was used 
as a basis for discriminating schizophrenics
from neurotics and character disorders. The 
accuracy of discrimination was judged against 
the criterion of psychiatric diagnosis. The hy- 
pothesis of 60% correct diagnoses was re- 
jected in favor of the alternate hypothesis of 
70% correct diagnoses in a sequential analy- 
sis test. Data were presented showing other 
diagnostic consequences in our sample of pa- 
tients arising from the application of various 
scatter index cutting scores. 


Received April 25, 1957. 
Validity of the Hewson Ratios: Investigation of a
Fundamental Methodological Consideration¹


Walter F. McKeever and Alvin I. Gerstein 


University of Rochester 


Hewson (2) has described a method for 
diagnosing brain pathology from ratios com- 
puted between Wechsler-Bellevue I weighted 
subtest scores. The effects of age and IQ 
variables on these ratios have never been as- 
sessed, and little validation data are available 
on the method. The present study aims to 
help fill these gaps. 

Ss were 50 schizophrenics and 52 organics. 
All were between 18-55 years old, had had 
no electroshock, had full W-B I tests, and 
had IQs of at least 80. Adequacy of diagnosis 
was partially assured by the requirement that 
each patient have at least two congruent, in- 
dependent diagnoses, and never have been 
diagnosed differently. The total groups were 
matched on age and IQ. These groups were 
also divided into those with IQs above and 
below 100, so that 30 schizophrenics and 30 
organics made up the 100+ groups, the re- 
mainders constituting the 99— groups. There 
were no differences within or between groups 
on age, and none on IQ across diagnostic 
groups. 

Over-all accuracy of Hewson diagnoses was 
54.9%, or essentially chance. Although the 
100+ groups were significantly differentiated in 
the correct direction, fully a third of the 100+ 
schizophrenics were misclassified as organics. 
In the 99— groups, actual organics were cor- 
rectly identified no better than chance, while 

¹ An extended report of this study may be obtained
without charge from Walter F. McKeever,
Department of Psychology, University of Rochester,
Rochester, N. Y., or for a fee from the American
Documentation Institute. Order Document No. ——,
remitting $—— for microfilm or $—— for photocopies.
The aid of B. F. McNeal and the staff of Canandaigua
VA Hospital, Canandaigua, N. Y., is gratefully
acknowledged. The study was done during the
writers’ internship at Canandaigua.


75% of the schizophrenics were misclassified 
as organics, a finding approaching significance
(p = .10). Patients who performed organically
were significantly older and less intelligent
as a group than were those who did
not (p = .01). Point biserial correlations be- 
tween organic performance and age and IQ, 
respectively, were .45 and — .34, both signifi- 
cant beyond the 1% level. The r between age 
and IQ was .09, nonsignificant. 

These results are generally inconsistent 
with the more positive ones reported by 
Gutman (1). Inspection of Gutman’s data 
suggests the possibility that her organics may 
have been significantly older and less intelli- 
gent than the controls, a situation favoring 
artifactually positive results. 

The following conclusions seem justified: 

1. Organic performance on the Hewson 
ratios varies systematically with age and IQ 
variables. Nearly a third of the variance of 
organic performance is associated with vari- 
ance in age and IQ. 

2. The method does not differentiate the 
groups studied beyond a chance level; among 
patients with IQs below 100 it makes at least 
as many errors as it does correct diagnoses. 

3. The need for control of logically rele- 
vant variables seems re-emphasized by the 
findings. 
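
The report does not say how the figure of nearly a third in conclusion 1 was reached. One possible reconstruction, offered only as an assumption, treats age and IQ as two predictors of organic performance and combines the three reported correlations (.45, −.34, and .09) with the standard two-predictor formula for a squared multiple correlation. The function name below is invented for illustration.

```python
def multiple_r_squared(r_y1, r_y2, r_12):
    """Squared multiple correlation of a criterion with two predictors."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


r_age = 0.45      # organic performance with age (reported)
r_iq = -0.34      # organic performance with IQ (reported)
r_age_iq = 0.09   # age with IQ (reported)

sum_of_squares = r_age ** 2 + r_iq ** 2
r_squared = multiple_r_squared(r_age, r_iq, r_age_iq)

print(f"sum of squared correlations: {sum_of_squares:.3f}")
print(f"two-predictor R squared:     {r_squared:.3f}")
```

With the reported correlations the simple sum of the squared values is about .32 and the two-predictor figure is about .35, so either route lands in the neighborhood of one third.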

Brief Report. 
Received December 2, 1957. 
Brain Injury and Intellectual Performance


Alan O. Ross¹
Clifford Beers Guidance Clinic, New Haven, Conn. 


The question whether intelligence deterio- 
rates after brain injury has been investigated 
in a number of studies whose results are con- 
tradictory (4). In connection with an investi- 
gation dealing with the effect of brain injury 
on tactual perception, this writer (3) gathered 
data relevant to intelligence test performance 
which, though not conclusive, indicate that 
brain injury may have a deleterious effect on 
the ability measured by such tests. 

Subjects (Ss) were 20 soldier patients with 
a mean age of 22.8 years who had undergone 
brain surgery within 12 months before the 
date of the study. Only those cases were in- 
cluded where surgery had involved at least an 
incision of the dura. Except for one tumor and 
two accident cases, all traumata had been due 
to gunshot or shell-fragment injuries located in
the frontal, temporal, or parietal regions. For 
each S, the score he had made on the Army 
General Classification Test (5), taken some 
time before he sustained his injury, was ob- 
tained from the soldier’s records, furnishing 
a measure of his premorbid intelligence. At 
the time of the study, each S was given the 
CVS Individual Intelligence Scale (1) which 
provided a measure of his postinjury func- 
tioning. The CVS consists of the comprehen- 
sion and similarities items of the Wechsler- 
Bellevue and a vocabulary scale based on the 
Stanford-Binet word list. The CVS correlates 
with a classification test similar to the AGCT
on the order of .80 (1), and in an analysis 
conducted by this writer, using 37 normal Ss, 
the AGCT and CVS were found to have an r 
of .75 (t = 6.61, p < .001). 

By transforming the AGCT and CVS scores 

¹ At the time of this study the writer was a 1st
Lt., MSC, US Army, stationed at Walter Reed Army
Hospital.


into deciles, it is possible to test for the sig- 
nificance of the difference between the pa- 
tients’ preinjury and postinjury scores, using 
the formula for correlated means presented by 
McNemar (2). The postinjury scores are sig- 
nificantly lower than the preinjury scores with 
a t of 3.50 (p < .01). This relationship can 
also be demonstrated by a matched group 
technique where the more highly discriminat- 
ing CVS standard scores and original AGCT 
scores can be utilized, thus eliminating the 
transformation into deciles. Each patient was 
matched individually with a normal, drawn 
from the service personnel of a major army 
hospital, controlling for AGCT score, age, and 
years of education. Each of the normals was 
given the CVS as part of the study. Table 1 
shows the result of the AGCT matching and 
the comparison of CVS scores for patients 
and normals. When patients are matched with 
normals on the basis of the patients’ preinjury 
scores, the patients have significantly lower 
scores when tested within 12 months after 
brain surgery. 
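
A test for the difference between correlated means can be sketched as below. The sketch uses the difference-score form, which is algebraically equivalent to the usual formula for the standard error of a difference between correlated means; whether it matches the exact computing formula in McNemar (2) is an assumption. The function name and the decile scores are hypothetical and are not the study's data.

```python
import math
from statistics import mean, stdev


def correlated_means_t(first, second):
    """Return (t, df) for paired scores, using the differences first - second."""
    if len(first) != len(second):
        raise ValueError("each subject needs a score on both occasions")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical preinjury and postinjury decile scores for six subjects.
preinjury = [7, 5, 8, 6, 9, 4]
postinjury = [5, 5, 6, 4, 8, 3]

t, df = correlated_means_t(preinjury, postinjury)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

The same function applies to the matched-group comparisons if each patient's score is paired with that of his matched normal.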

A converse comparison is possible by match- 
ing the patients with different normals using 
the criterion of patients’ postinjury (CVS) 
scores, again holding age and years of educa- 
tion constant. An analysis of the differences 
in AGCT scores for the matched pairs shows 
that the patients’ (preinjury) scores are significantly
higher with a t of 3.23 (p < .01).

These results, though statistically signifi- 
cant, provide only presumptive indication that 
intelligence test performance deteriorates as 
the result of certain forms of brain injury. A 
more direct test of this question would have 
been to readminister the AGCT to all patients 
postoperatively, but this was unfortunately 
not possible due to the procedure dictated by 
the primary study (3) to which these findings
are incidental.


Table 1
Intelligence Test Scores of Patients and Normals
(N = 20 pairs)

            Mean       Mean       t for
Test       Normals    Patients    Difference     p

AGCT        59.0       59.6         .62        >.50
CVS         31.4       28.4        2.50        <.05


References

1. Hunt, W. A., & French, Elizabeth G. The CVS
abbreviated individual intelligence scale. J.
consult. Psychol., 1952, 16, 181-186.
2. McNemar, Q. Psychological statistics. New York:
Wiley, 1949.
3. Ross, A. O. Tactual perception of form by the
brain-injured. J. abnorm. soc. Psychol., 1954,
49, 566-572.
4. Ross, A. O. Integration as a basic cerebral function.
Psychol. Rep., 1955, 1, 179-202. (Monogr.
Suppl. No. 2.)
5. U. S. War Department, Adjutant General’s Office,
Classification and Replacement Branch, Pers.
Res. Sec. The army general classification test.
The Clinical Usefulness of the Archimedes Spiral
in the Diagnosis of Organic Brain Damage¹,²


Lewis R. Goldberg 


University of Michigan 


and Philip A. Smith 
Ann Arbor VA Hospital 


In reporting their promising results using 
the spiral aftereffect to differentiate groups of 
patients with cortical involvement from non- 
organic control groups, Price and Deabler 
concluded that “. . . the results are such as 
to justify present clinical use of this tech- 
nique in diagnosis of organicity” (4, p. 302). 
They found almost no overlap in the per- 
formance of their groups: fewer than 8% 
of normals or functional psychiatric patients 
were misclassified as organic and only 2% of 
their organic group was incorrectly classified. 
While they reported no data on such vari- 
ables as age, which might have contributed 
to their results, the description of the syndromes
included in the organic group suggests
that this group was considerably older than 
the normals or psychiatric patients. However, 
later studies by Gallese (2) and Page et al. 
(3) showed no significant age-score relation- 
ship, and offered some additional evidence 
for Price and Deabler’s original assumption. 
Gallese, however, noted that the spiral test 
was relatively insensitive with certain types 
of organic patients, notably those with con- 
vulsive disorders or brain syndromes associ- 
ated with alcoholism. The performance of 
twelve lobotomized schizophrenics also was 


¹ The authors wish to thank E. Lowell Kelly of
the University of Michigan for his critical reading of
this manuscript, and express gratitude to the Psychiatric
Research Committee of the Ann Arbor VA
Hospital for encouragement and assistance in this
study.

² The statements and conclusions of the authors do
not necessarily reflect the opinions or policy of the
Veterans Administration.


indistinguishable from that of nonlobotomized
schizophrenics and normals. Page et al. supported
Price and Deabler’s theoretical assumptions,
but their study provided “. . . less evidence
that the aftereffect may serve
as an effective diagnostic device” (3, p. 91).
These investigators also noted unimpaired 
performance on the part of some lobotomized 
schizophrenics. They did not report separate 
data for patients with convulsive disorders. 
A study by Standlee (5) of psychiatric pa- 
tients tested before and after electric shock 
treatment (which generally produces some 
temporary organic dysfunction) was incon- 
clusive because the effect of prior experience 
with the spiral was not controlled. 

During the course of their clinical practice 
at the Veterans Administration Hospital, Ann 
Arbor, Michigan, the present writers utilized 
the Archimedes spiral as a part of their test 
battery in examining patients referred for 
psychological evaluation. The present paper 
reports on findings over a period of one year. 


Method 


Apparatus. The instrument used in this 
study was a 78 rpm phonograph turntable, 
vertically mounted on an 8 × 14 inch black
box which housed the motor. One clockwise
and one counterclockwise Archimedes (or
Plateau) spiral, each 2½ circuits (920 degrees)
about the center, were painted in
black on 10-inch white cardboard discs, 
which could be affixed to the turntable. This 
instrument differed slightly from that of 
Table 1
Diagnostic Classification of Subjects

Subject group    Diagnosis

Normal           No pathology
Psychiatric      Schizophrenia
                 Psychoneurosis
                 Psychophysiologic reaction
Post-EST         Schizophrenia
                 Psychoneurosis
Organic          Cerebra...
                 Arteriose...
                 Convulsive disorder
                 Cerebral neoplasm
                 Encephalomyelitis
                 CNS lues
                 Paralysis agitans
                 ...dosis

previous investigators both in speed of rota- 
tion and diameter of the spiral. Others (1, 2, 
3, 4) have reported using spirals varying 
from 3½ inches to 8 inches in diameter and
at speeds varying from 78 to 100 rpm. 
Procedure. The following procedure was
standard for all subjects (Ss): S was seated
approximately 8 feet from the instrument in 
an office with good illumination. The spirals 
were presented in ABBA order, with the in- 
terval between successive trials varying with 
the duration of reported after-image. Each
trial lasted 30 seconds. Instructions
given to Ss were essentially the same as
those used by Price and Deabler: “This is a 
special eye test. Look at the center (point- 
ing) and don’t take your eyes away until I 
tell you to.” Approximately ten seconds after 
rotation was begun, S was asked, “What does 
the line appear to be doing?” After the spiral
had been braked to a stop, the last instruc- 
tion was repeated and S’s response was re- 
corded. 

Subjects. The population from which Ss 
were selected included both staff and patients 
from the psychiatric and neurological wards 
of the Ann Arbor VA Hospital (Table 1). 
Patients classified as organics were those 
with mild to grossly limiting cortical dysfunc- 
tion, confirmed by independent neurological 
diagnosis. Patients with questionable or marginal
organic dysfunction were not included
in the present analysis, except for a group of 
eleven psychiatric patients who were seen 
one to seven days following a course of four 
or more electro-shock treatments (EST). 
These were presumed to show a possible 
temporary organic deficit due to the effects 
of EST. The remaining psychiatric patients 
were untreated by EST and were seen dur- 
ing the acute fulminating stage of their ill- 
ness, shortly after admission to the hospital. 
The normal control group included ward phy- 
sicians, psychiatrists, psychologists, nurses, 
secretaries, and students employed at the 
hospital. 

Scoring. Responses from all Ss were re- 
corded verbatim and scored according to the 
criteria of Price and Deabler and also those 
proposed by Gallese. The latter method, 
which gave equal credit for a description of 
the aftereffect either as a change in apparent 
size or direction of movement, was judged the 
most applicable for the present analysis, 
since a large proportion of responses was 
phrased as some combination of these terms. 
Working independently from a sample of 38 
verbatim protocols, the authors agreed per- 
fectly in this manner of scoring for 35 Ss. 
Since four partial scores contributed to the 
total score for each S, this represented a 
total of 152 sets of independent judgments 
for which the scorers agreed 146 times 
(96%). 
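
The two levels of scorer agreement reported above (whole protocols and individual partial scores) can be computed as in the following sketch. The function name and the three example protocols are hypothetical, and treating each of the four partial scores as 0 or 1 is an assumption; only the figures 38, 4, 152, and 146 come from the text.

```python
def agreement_rates(scorer_a, scorer_b):
    """Return (protocol_rate, judgment_rate) for two scorers' partial scores.

    Each argument is a list with one tuple of four partial scores per protocol.
    """
    if len(scorer_a) != len(scorer_b):
        raise ValueError("both scorers must rate the same protocols")
    protocols_agreed = sum(a == b for a, b in zip(scorer_a, scorer_b))
    pairs = [(x, y) for a, b in zip(scorer_a, scorer_b) for x, y in zip(a, b)]
    judgments_agreed = sum(x == y for x, y in pairs)
    return protocols_agreed / len(scorer_a), judgments_agreed / len(pairs)


# Hypothetical ratings for three protocols, for illustration only.
scorer_a = [(1, 1, 1, 1), (1, 0, 1, 0), (0, 0, 0, 0)]
scorer_b = [(1, 1, 1, 1), (1, 0, 0, 1), (0, 0, 0, 0)]
protocol_rate, judgment_rate = agreement_rates(scorer_a, scorer_b)
print(f"protocols: {protocol_rate:.0%}, judgments: {judgment_rate:.0%}")

# Arithmetic check on the reported figures.
reported_judgments = 38 * 4
reported_rate = 146 / reported_judgments
print(reported_judgments, f"{reported_rate:.0%}")
```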


Results 


Table 2 summarizes the findings for all 
groups. The normal Ss were superior to all 
others, attaining, without exception, perfect 
scores. Because of the absence of within- 


Table 2
Spiral Aftereffect Scores

                                Mean score
                    Mean        adjusted
Subject group       score       for age

Normal              4.00
Psychiatric         2.94         2.72
Post-EST            2.55         2.45
Organic             2.17         2.37
Table 1 (continued)

Subject group       N                        Mean age

Normal                                        29.3
Psychiatric                                   35.7
Post-EST                                      38.5
Organic             7, 5, 5, 3, 2, 1, 1       45.0

N entries preceding the organic group: 8, 5, 1
Table 3
Distribution of Aftereffect Scores
(after Price and Deabler)

                        Score
Subject group        0        1

Normal               0%       0%
Psychiatric         11%      18%
Post-EST            18%      27%
Organic             33%      13%

group variability for the normal Ss, usual 
parametric tests of significance were inap- 
propriate. Accordingly, a chi-square value 
was obtained for the difference between nor- 
mals and the combined patient groups. When 
this was found to be significantly large (χ²
= 23.50, p < .001), the normals were excluded
from subsequent statistical tests. An
analysis of covariance for spiral scores of the
remaining groups, adjusted for age, was not
statistically significant (F < 1). Examination
of the adjusted mean scores, presented
in Table 2, shows that the organics neverthe- 
less maintained the low order position among 
the groups. 

The Pearson product-moment correlation 
between age and spiral score was −.39. For
32 patients for whom IQ measures were available,
the correlation between intelligence and
spiral score was small and not significant
(r = .11).

Because these findings seemed so incom- 
patible with the results of previous studies, 
the data were arranged according to the pro- 
cedures followed by earlier investigators and 
new comparisons made. Table 3 shows the 
distribution of scores following Price and 
Deabler’s original presentation. The patients 
were then split into those whose scores were 
2 or under and those whose scores exceeded 
2, after Gallese. Table 4 summarizes this 
analysis and shows the percentage of patients 
who were correctly classified as “organic” or 
“normal” by this method. Table 4 also shows 
the results of this breakdown for patients 
classed as convulsive disorder or “other 
types” of organic dysfunction. The present 
figures contrast sharply with those reported 
by Price and Deabler and Gallese. 
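
The split described above (total scores of 2 or under counted as “organic,” scores above 2 as “normal”) and the resulting percentage correctly classified can be sketched as follows. The function names and the six example records are hypothetical and are not the study's data; only the cutoff of 2 comes from the text.

```python
def gallese_class(total_score, cutoff=2):
    """Classify a total aftereffect score (0 to 4) by the cutoff described in the text."""
    return "organic" if total_score <= cutoff else "normal"


def percent_correct(records):
    """records: iterable of (total_score, true_class) pairs."""
    records = list(records)
    hits = sum(gallese_class(score) == truth for score, truth in records)
    return 100.0 * hits / len(records)


# Hypothetical records, for illustration only.
hypothetical = [
    (4, "normal"), (3, "normal"), (1, "normal"),
    (0, "organic"), (2, "organic"), (4, "organic"),
]
print(f"{percent_correct(hypothetical):.1f}% correctly classified")
```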


Discussion 

These findings cast considerable doubt on
the utility of the spiral aftereffect for differential
diagnosis of organicity. Although
Price and Deabler’s results appeared
promising, later work (2, 3) made their findings
look more conditional. The present findings
add an even more compelling note of
caution to the use of the instrument as a 
method of diagnosis. This seems especially 
important since all the published studies used 
clear-cut cases of cortical dysfunction, while 
in clinical practice, neurological involvement 
is often marginal or partial and thus the 
diagnostic problem might be expected to be 
even more difficult. 

The organics used in the present study 
were all drawn from a group of recent ad- 
missions to a general medical and surgical 
hospital, and one might hypothesize that such 
patients are more likely to show acute rather
than chronic deteriorative effects of cortical 
destruction. Patients exhibiting chronic symp- 
toms are more typically found in the VA 
neuropsychiatric hospitals and state institu- 
tions where the earlier studies were con- 
ducted. This might be one reason why the 
present data differ from previous findings.
For the same reason, however, the present 
psychiatric groups did not include chronic, 
deteriorated schizophrenics, and it is difficult 
to understand how so large a proportion of 
these relatively intact patients failed to at- 
tain perfect scores. Our instructions and scor- 
ing procedures were identical to those of ear- 
lier investigations although some differences 
may have arisen from the manner in which 


Table 4

Classification of Subjects According to
Gallese Criteria

                         Organic          Normal
                         score            score           Correctly
Subject group            (2 or under)     (3 or over)     classified

Normal                                        30
Psychiatric                                   11
Post-EST                                       6
Organic                                       12
  Convulsive disorder                          3
  Other types                                  9
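Table 3 (continued)

                        Score
Subject group        2        3        4

Normal               0%       0%     100%
Psychiatric          6%       6%      59%
Post-EST             0%      18%      37%
Organic              4%      13%      37%


Table 4 (continued)

                         Correctly
Subject group            classified

Psychiatric                 65%
Organic                     50%
  Convulsive disorder       40%
  Other types               53%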
inquiries were conducted. Gallese noted that
many schizophrenics would have been scored 
“organic” had not a detailed, direct inquiry 
been performed, and Page et al. reported a 
need for an objective nonverbal response 
measure for use with the spiral. Perhaps a 
multiple choice response situation such as 
that offered by Freeman and Josey (1) would 
insure more uniform results. 

Many patients, both organic and psychi- 
atric, suffered from disturbances in their 
ability to communicate effectively. Thus, al- 
though the spiral as originally presented ap- 
peared to be a neat, objective evaluative tech- 
nique, in actual practice a large degree of 
clinical acumen may be necessary to deter- 
mine an S’s exact perception. The less con- 
tact the patient has with reality (whether 
functional or organic in origin) the less is 
the likelihood that correct inferences regard- 
ing his perception can be made. Caution must 
be observed in interpreting the remarks of 
negativistic patients, as well as overly defen- 
sive persons who wish to deny any experi- 
ence which they perceive as a malfunction. 
Some patients expressed fear that reporting 
the aftereffect was tantamount to admitting 
a delusion, and one was concerned lest he be 
hypnotized unknowingly. Other patients, de- 
spite repeated instructions, responded to the 
spiral as a kind of projective technique, re- 
porting such phenomena as “it looks like a 
coiled snake, ready to strike,” and “it is a 
ram’s horn, all curled up.” The present au- 
thors do not imply that these cautions are in 
any way unique to the spiral technique but, 
rather, that this test shares with many other 
diagnostic instruments the difficulties en- 
countered in evaluating verbal report. 

It would appear also that variables such as 
the rate of rotation and size of the spiral, 
level of illumination, and other test condi- 
tions might well have some influence on re- 
sults. For example, it was noted in prelimi- 
nary work that the perception of movement 
toward or away from S was in part a func- 
tion of the distance between the exact center 
of the turntable and the placement of the 
spiral upon it. When the spiral was posi- 
tioned so that it turned about its exact cen- 
ter, then much less perception of changing 
distance occurred. Whether such variables 
have a differential effect on different populations 
is unknown. 

The one variable in this study which did 
seem related to ability to perceive the effect 
was age. Moreover, Freeman and Josey found 
marked differences in ability to see the after- 
effect in a population of psychotic patients, 
and although they related this deficiency to 
memory impairment, their data would indi- 
cate that age may have been a factor. In 
both the Price and Deabler and Gallese stud- 
ies, the organic groups appeared substantially 
older than their nonorganic controls. 

There is a discrepancy between the present 
results and those of Standlee in regard to 
post-EST patients. Standlee reported that 23 
out of 25 patients perceived the aftereffect 
on a retest 8 hours after one EST. In the 
present post-EST groups, scores ranged from 
0 to 4 and were little correlated with clinical 
impression of “organicity.” As has been sug- 
gested elsewhere (4), practice effects may 
have contaminated Standlee’s results; more- 
over, the discrepancy between the two studies 
was probably even more heightened by the 
fact that all Standlee’s Ss had considerably 
fewer shock treatments than the present ones. 
On the other hand, in the present study there 
was no apparent correlation between after- 
effect score and number of EST or length of 
time since the last treatment. 

One of the paramount problems in all stud- 
ies utilizing the spiral has been the difficulty 
in establishing a satisfactory set of criteria of 
organic brain damage. In most studies, an im- 
plicit assumption is made that localization, 
chronicity, and degree of neurological de- 
struction are irrelevant variables, as well as 
such factors as the optical efficiency of the 
S and the pharmacological effect of the drugs 
he has been administered. The fact that such 
a viewpoint is basically unsound has already 
been suggested (6). To date most studies, in- 
cluding the present one, confound these cru- 
cial variables and thus they cannot be effec- 
tively compared. 


Summary 


This study is an evaluation of the Archi- 
medes spiral as a clinical technique for the 
diagnosis of organic brain damage. Over a period 
of a year’s work with the instrument the 
authors noted that normal Ss report perceiv- 
ing the aftereffect of both expanding and 
contracting spirals without a single instance 
of failure. Psychiatric, post-EST, and organic 
patients—in respective order—performed with 
decreasing efficiency on the same task. When 
the scores for the groups were adjusted for 
age, however, the differences between the 
latter three groups became statistically indis- 
tinguishable. The correlation between age and 
spiral score was -.39. 

In spite of early enthusiasm for the spiral, 
the present results warn against its indis- 
criminate use for differential diagnosis. 


Received June 1, 1957. 
Identification of Item Factor Patterns 
Within the Manifest Anxiety Scale¹ 


A. W. Bendig 
University of Pittsburgh 


O’Connor, Lorr, and Stafford (2) have re- 
ported a factor analytic study of the item con- 
tent of Taylor’s Manifest Anxiety Scale. Tet- 
rachoric correlations were computed among 
42 MAS items (omitting eight items with 
marginal frequencies of less than 10 per cent) 
for a sample of 220 undergraduate Ss. Five 
oblique factors were extracted and described 
in terms of the apparent content of items 
highly loaded for each factor. No external 
validation of the factors was attempted. 

It has been shown (1) that MAS scores 
correlate highly (r = .76) with scores from 
the Neuroticism (N) scale contained in 
Eysenck’s Maudsley Personality Inventory 
(MPI) and are less highly related (r = -.34) 
to the MPI Extraversion (E) scale. This 
suggests that some of the MAS factor pat- 
terns found by O’Connor, Lorr, and Stafford 
may be identified as “extraversion” or “neu- 
roticism” factors within the multidimensional 
MAS items. Since the factor analytic study 
used mixed male and female Ss it is possible 
that one or more of the factors obtained 
might be a “sex” factor. 

Both the MAS and MPI inventories were 
administered to 145 Ss (100 men and 45 
women) enrolled in introductory psychology. 
Each of the 42 MAS items was correlated 
(biserial-phi) with the sex criterion and these 
coefficients were correlated (product-moment) 
with the item factor loadings reported by 
O’Connor, Lorr, and Stafford. None of these 
five correlations (one for each factor) were 
significant. 
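
The item-level procedure described above (correlate each item with a criterion, then correlate those 42 coefficients with the loadings on one factor) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's computation: the responses and loadings are randomly generated placeholders, the coding of the sex criterion is hypothetical, and the plain phi coefficient (a product-moment correlation between two dichotomies) stands in for the biserial-phi coefficient named in the text.

```python
import math
import random


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def item_validities(responses, criterion):
    """Correlate every 0/1 item with a 0/1 criterion (phi coefficient)."""
    n_items = len(responses[0])
    return [
        pearson([row[j] for row in responses], criterion)
        for j in range(n_items)
    ]


random.seed(0)
n_subjects, n_items = 145, 42

# Hypothetical coding of the criterion: 1 = man, 0 = woman.
criterion = [1] * 100 + [0] * 45

# Placeholder data: random answers and random loadings for one factor.
responses = [
    [random.randint(0, 1) for _ in range(n_items)] for _ in range(n_subjects)
]
loadings = [random.uniform(-0.6, 0.6) for _ in range(n_items)]

validities = item_validities(responses, criterion)
r_validity_loading = pearson(validities, loadings)
print(round(r_validity_loading, 2))
```

With real data the last step would be repeated once per factor, giving the five correlations referred to in the text.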

¹ An extended report of this study may be ob- 
tained without charge from A. W. Bendig, Dept. of 
Psychology, University of Pittsburgh, Pittsburgh 13, 
Pa., or for a fee from the American Documentation 
Institute. Order Document No. 5466, remitting $1.25 
for microfilm or $1.25 for photocopies. 

The Ss were dichotomized at the medians 
of the distributions of their E and N scores 
and each of the 42 MAS items was corre- 
lated (tetrachoric) with total scores on these 
two MPI scales. The E criterion item validi- 
ties were then correlated with each of the five 
sets of item factor loadings and a similar 
procedure was followed with the N criterion 
item validities. The item validities for the E 
scale criterion were significantly related (.01 
level) to the factor loadings on Factor D (r 
= -.51), but were not significantly correlated 
with the other four factors (r = .06, .01, 
.05, and -.04). The N scale item validities 
were significantly correlated with factor loadings 
on Factors A, B, and D (r = .38, -.49, 
and .35), but were not related to Factors C 
and E (r = .23 and .25). The E and N cor- 
relations with Factor A (r= .06 and .38) 
were significantly different at the .01 level 
(F = 7.94) as were the E and N correlations 
with Factor B (r = .01 and -.49, F = 
11.12). The E and N correlations with Factor 
D (r = -.51 and .35) were not significantly 
different (F = 1.06) because of the negative 
correlation (r = -.42) between the E and N 
criteria validities of the 42 MAS items. 
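
The median split and tetrachoric step described above can be illustrated with a short sketch. It assumes Pearson's "cos-pi" approximation to the tetrachoric coefficient, which may not be the computing method actually used in the study; the function names and the tie rule for the median split (scores at the median go to the low group) are likewise assumptions made for illustration.

```python
import math
import statistics


def median_split(scores):
    """Dichotomize scores: 1 if above the median, otherwise 0."""
    med = statistics.median(scores)
    return [1 if s > med else 0 for s in scores]


def tetrachoric_cos_pi(x, y):
    """Cos-pi approximation to the tetrachoric r for two 0/1 variables."""
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)
    if b * c == 0:
        # No discordant pairs in one direction: the approximation tends
        # to +1, and is undefined if the concordant product is zero too.
        return 1.0 if a * d > 0 else float("nan")
    return math.cos(math.pi / (1 + math.sqrt(a * d / (b * c))))


# Hypothetical scale scores and item answers for eight subjects.
scale_scores = [3, 9, 12, 5, 14, 7, 10, 2]
item_answers = [0, 1, 1, 0, 1, 0, 0, 1]

high_low = median_split(scale_scores)
print(round(tetrachoric_cos_pi(item_answers, high_low), 2))
```

The approximation is closest when both variables are split near their medians, which is the case for the criterion in the procedure described here.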
Factors A and B found by O’Connor, Lorr, 
and Stafford (2) appear to be similar to the 
“neuroticism” factor identified by Eysenck, 
while Factor D seems to be a combination of 
Eysenck’s “neuroticism” and “introversion” 
factors. 
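
The comparisons of the E and N correlations reported above involve two correlations that share one variable (the loadings on a factor) and are computed over the same 42 items. The report does not say which test produced the F ratios. One standard choice for this situation is Hotelling's t for two dependent correlations, whose square is an F ratio with 1 and n - 3 degrees of freedom; the sketch below assumes that test and uses the rounded Factor A values printed in the text.

```python
import math


def hotelling_t(r12, r13, r23, n):
    """Hotelling's t for H0: rho12 == rho13, where variable 1 is shared.

    r12, r13 : correlations of the shared variable with variables 2 and 3
    r23      : correlation between variables 2 and 3
    n        : number of observations (here, items)
    Returns (t, degrees of freedom).
    """
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    t = (r12 - r13) * math.sqrt((n - 3) * (1 + r23) / (2 * det))
    return t, n - 3


# Factor A: loadings vs. E validities (.06) and vs. N validities (.38);
# E and N validities correlate -.42 over the 42 items.
t_a, df_a = hotelling_t(0.06, 0.38, -0.42, 42)
f_a = t_a**2  # equivalent F ratio with 1 and df_a degrees of freedom
print(round(f_a, 2), df_a)
```

The value obtained this way need not agree with the reported F ratios, since the result depends on the test actually used, on the scoring direction of the E validities, and on rounding in the printed correlations.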
Brief Report. 
Received November 12, 1957. 